Spring 2023 Workshops

LATIS offers a series of workshops that are free and open to all faculty and graduate students. Join our LATIS Research Workshops Google Group to be the first to learn about workshops. You can view the videos, slides, and materials from past workshops at the LATIS Workshop Materials website.

Workshops are also offered on even more topics from partner departments:

  • DASH (Digital Arts, Sciences, and Humanities)
  • MSI (MN Supercomputing Institute)

Spring 2023 LATIS Workshop Schedule

Workshops will be a mix of in-person and online formats. Click on the links below for a detailed description of each workshop.

Register here for one or more workshops! 

Feb 3 | 10:00am-noon | Introduction to Computational Text Analysis in Python

Feb 10 | 10:00am-noon | Introduction to Git and GitHub

March 17 | 10:00am-noon | Visualizing Data with ggplot2 Part 1: The Basics

March 24 | 10:00am-noon | Visualizing Data with ggplot2 Part 2: The "Additions"

April 7 | 10:00am-noon | Advanced NVivo | In Person

April 21 | 10:00am-noon | Creating and Updating Research Databases with SQL | In Person

Asynchronous + Consultation | Managing Data When You Graduate | Online


Register today! 

Asynchronous Workshops

We also offer asynchronous workshops in Canvas that you can take at your own pace. Please contact us at [email protected] with any questions or if you have trouble enrolling. Click on the links below for a detailed description of each workshop.

Available anytime | Introduction to Survey Sampling | Enroll Now
Available anytime | Qualtrics - Tutorials | Enroll Now
Available anytime | Working with Data in R - Tutorials | Enroll Now
Available anytime | Linux for Research Computing | Enroll Now

Workshop Descriptions

Introduction to Computational Text Analysis

Scholars in humanities and social science fields are using computational tools to explore large corpora of digital texts. This hands-on workshop will introduce some common methods such as topic modeling and sentiment analysis, as well as fundamental cleaning and processing tasks for a text analysis workflow in Python.

This workshop will cover how to:
  • Read and write text files in Python
  • Manipulate ‘strings’ of text
  • Pre-process text for analysis (basic cleaning tasks such as normalizing case, stripping punctuation and whitespace, etc)
  • Count word frequencies
  • Create a document term matrix (a ‘bag of words’)
  • Build topic models and conduct sentiment analysis

This workshop will also briefly introduce concepts and tools related to other common computational text analysis tasks: regular expressions (regex) and text cleaning, string matching and fuzzy matching, NLTK tools such as named entity recognition and parts-of-speech tagging, word embeddings (word2vec), classification tasks (e.g., stylometry, genre identification…)
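The basic cleaning and counting steps listed above can be sketched in a few lines of standard-library Python. This is an illustrative example, not the workshop's own materials; the sample text and variable names are invented:

```python
from collections import Counter
import string

text = "The cat sat. The CAT sat on the mat!"

# Pre-process: normalize case and strip punctuation
cleaned = text.lower().translate(str.maketrans("", "", string.punctuation))

# Tokenize on whitespace and count word frequencies
tokens = cleaned.split()
freqs = Counter(tokens)

print(freqs.most_common(2))  # [('the', 3), ('cat', 2)]
```

The same `Counter` per document is the starting point for a document term matrix: each document becomes a row of word counts over a shared vocabulary.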

Introduction to Git and GitHub

GitHub is a web application for hosting, sharing, and tracking digital assets like source code and datasets. GitHub and the git family of tools keep track of changes to your files as you work and provide easy ways to integrate changes from multiple people. If you’ve ever found yourself making files named “copy_copy_final” and “copy_copy_real_final”, Git is for you.

This workshop will cover how to:
  • Create a repository with the University-provided github.umn.edu website
  • Use GitHub Desktop or the Git command line interface to track files on your own computer and push them up to GitHub.
  • Use Git to manage revisions and collaborate with team members 
To be successful, you should have:
  • A computer with GitHub Desktop installed. There will also be an online environment available using the Git command line tools should you not wish, or be unable, to install GitHub Desktop.
  • A University of Minnesota Internet ID

Visualizing Data with ggplot2

The first part of this two-part workshop will introduce the logic behind ggplot2 and give users hands-on experience creating data visualizations with this package in R. R is a powerful tool for statistical computing, but its base graphics capabilities can be limited, and complicated plots often require a considerable amount of code. The ggplot2 package extends R’s capability for data visualization, allowing users to produce attractive and complex graphics in a relatively simple way. Day 1 will cover the basics of ggplot2, while Day 2 will cover more advanced topics: customizing the look of graphs for publication or presentation, adding multiple elements to graphs, and grouping plots together. We strongly recommend attending Day 1 before attending Day 2 unless you are already very familiar with the basics of ggplot2 (i.e., you understand the elements outlined in the Day 1 description below).

Day 1  (“The Fundamentals” on 3/17/2023) of this workshop will cover:
  1. The basics of the "grammar of graphics" underlying ggplot2's functionality
  2. How to create a variety of reproducible data visualizations in R, such as histograms, line charts, scatter plots, heatmaps, and density plots
  3. Multiple ways to visualize data by groups, including color labeling and faceting 
Day 2 (“The Embellishments” on 3/24/2023) of this workshop will cover: 
  1. How to adjust colors, shapes, legends, and axes
  2. Using and customizing themes to adjust the look and feel of the graph
  3. Combining graphs using packages such as cowplot
  4. How to reproducibly save graphs for publication. 

To be successful, you should:

  • Have R and RStudio installed on a computer you can use for this workshop
  • Have basic familiarity with R; no prior experience with ggplot2 is required.


Advanced NVivo

NVivo is a qualitative data management, coding, and markup tool that facilitates powerful querying and exploration of source materials for both mixed-methods and qualitative analysis. The software is provided for faculty and graduate students of the College of Liberal Arts and the College of Education and Human Development. This workshop introduces the advanced functions of NVivo; basic knowledge of NVivo is recommended.

This workshop will cover:
  • A brief review of adding and managing source materials and codes
  • Creating classifications & attributes (variables) with demographic data and importing them from Excel
  • Organizing materials into “cases” to facilitate comparison
  • Using “auto-coding” to segment transcripts and other structured text
  • Complex queries with codes and concepts subset by attributes, cases, or sources
  • Running the built-in interrater reliability metrics
  • Importing data from other software including Qualtrics, OneNote, and Zotero
  • Exporting frequencies and code counts to statistical packages
To be successful, you should
  • Have a basic understanding of qualitative research methods
  • Be familiar with NVivo’s interface and basic functions
  • Install NVivo from z.umn.edu/getNVivo prior to the session, or install a trial from QSR International’s website

Creating and Updating Research Databases with SQL

In Introduction to SQL and Research Databases (Spring 2022), participants learned about the Structured Query Language (SQL) and how to write queries that create, read, update, or delete data in an existing database (the so-called CRUD operations). In this installment of the research databases series, we will expand on CRUD operations and introduce the tools needed to create a database from scratch. Topics include: questions to ask when choosing a database technology; the fundamental building blocks of schema design; creating an Entity Relationship Diagram (ERD); SQL's Data Definition Language (DDL); optimizing your database for faster access and computations; and the basics of connecting to your database from within your code. Prerequisite: We highly encourage participants to attend Introduction to SQL and Research Databases, or to have sufficient knowledge of CRUD operations, before attending this session.

This workshop will cover:
  • Basic database design: What are tables, relations, indices, etc.
  • Data Definition Language (DDL): How to create tables, indices, etc.
  • Scripting and SQL: Writing scripts to access, view and manipulate data
  • Intro to optimizing database design and queries
To be successful, you should have: 
  • A laptop to bring to the workshop
  • Optional: Install Python on your laptop (we recommend Anaconda).
  • There will be an online environment available for using Python or R, so local installation on your laptop is not required.
  • Knowledge sufficient to write queries that create, read, update, and delete data in SQL. If you do not already have this knowledge, you can review our Intro to SQL workshop recording; there are also many online resources, such as Codecademy, that walk through the basics of SQL. 
  • An intro-level familiarity with the Python programming language and/or R
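As a rough sketch of how the DDL and CRUD pieces above fit together, here is a minimal example using Python's built-in sqlite3 module. The table, column, and index names are invented for illustration and are not from the workshop:

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database
cur = conn.cursor()

# DDL: define a table and an index
cur.execute("""
    CREATE TABLE participants (
        id INTEGER PRIMARY KEY,
        name TEXT NOT NULL,
        score REAL
    )
""")
cur.execute("CREATE INDEX idx_participants_name ON participants(name)")

# CRUD: create, update, delete...
cur.execute("INSERT INTO participants (name, score) VALUES (?, ?)", ("Ada", 92.5))
cur.execute("INSERT INTO participants (name, score) VALUES (?, ?)", ("Grace", 88.0))
cur.execute("UPDATE participants SET score = ? WHERE name = ?", (95.0, "Ada"))
cur.execute("DELETE FROM participants WHERE name = ?", ("Grace",))

# ...and read
rows = cur.execute("SELECT name, score FROM participants").fetchall()
print(rows)  # [('Ada', 95.0)]
conn.close()
```

The same SQL statements work against server-based databases such as PostgreSQL or MySQL; only the connection line changes.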


Managing Data When You Graduate

Research and creative work doesn't end with degree completion; however, access to many of the data storage tools and software that have supported that work changes when students become alumni. This workshop will help graduate students navigate questions about whether they can take their data and materials with them when they leave the university, and if so, how to do it. The workshop will be presented through a combination of online, asynchronous materials and small consultations to help students make a plan for their data and materials after graduation and beyond. Materials will be available on a Canvas site in mid-March, and individual consultations will be scheduled for early April. This workshop is co-organized with the University Libraries.

The asynchronous materials will cover:
  • The University policies that guide ownership of data
  • Access changes to storage, software, and services that happen upon graduation 
  • Strategies and tips for ensuring data are accessible and understandable long after graduation
The small consultations will cover:
  • How to make a plan to ensure a smooth transition for your data and materials between graduate school and your next endeavor
  • Specific advice and troubleshooting for your own research and situation. 
To be successful, you should:
  • Be a graduate student at the University of Minnesota who is at least a year into your program (it never hurts to plan early!) or nearing the end of your program.
  • Have a research project (part of a dissertation or thesis) that has generated data or materials that you want to keep track of after you leave. This can include collaborative projects that will continue at UMN after graduation.

    Introduction to Survey Sampling (Canvas Module)

    This is an interactive, self-paced Canvas course designed for those who either 1) are completely new to surveying or 2) have never had formal instruction in survey/sampling design. By the end of the course, you should be able to:

    1. Differentiate between a census and a sample
    2. Describe features and limitations of common sampling methods
    3. Recognize different sources of survey error/bias
    4. Describe how different sources of survey error/bias affect the conclusions you can draw with your survey

    This brief introductory course on sampling is designed to take around 1-3 hours to complete, depending on the material you choose to engage with.


    Qualtrics Tutorials (Canvas Modules)

    We have three asynchronous Canvas courses available for you to take: 

    1. Introduction to Qualtrics: Are you brand new to using Qualtrics? Or has it been a really long time since you used Qualtrics? Start here to learn the ropes. [Expected time: 1 hour]
    2. Qualtrics Data Integrity & Management: No matter if you are new to Qualtrics or a long-time user, this module is a must for any Qualtrics user who is interested in 1) how to make Qualtrics data more readable and suitable to their needs, 2) best practices for conducting reproducible research within Qualtrics (e.g., sharing and archiving survey information, how to export data reproducibly, etc.). [Expected time: 35-45 minutes]
    3. Designing Experiments & Complex Surveys in Qualtrics: Sometimes figuring out the right bells and whistles for more complex research designs in Qualtrics can be daunting. If you’re looking to build complex surveys or experimental tasks within Qualtrics, this tutorial is for you! We cover how to use some of the more complex functionality within Qualtrics, such as the survey flow, branching logic, embedded data, embedded media, piped text, “loop & merge”, integration with MTurk/Prolific, and more! In this module, you will watch a video walkthrough from our Fall 2021 workshop. [Expected time: 10-20 minutes for Canvas content; 2 hours of video content]

    Working with Data in R - Tutorials (Canvas Module)

    R is a popular tool for data analysis and statistical computing, and is a great alternative to tools like SPSS, Stata, or Excel. R is designed for reproducible research and can be used for many parts of the research process besides statistical analysis. This asynchronous course includes introductory readings, videos, and activities to build on and advance your data skills in R. 

    Topics include

    1. Foundations in R: Just starting in R? Welcome! This module will walk you through the basics of R and set the foundation for the more advanced modules below. 
    2. Publication-worthy graphs with ggplot2: Learn how to adjust colors, axes, legends, and themes, as well as how to reproducibly save graphs for publication.
    3. Create a table using dplyr: Learn how to aggregate data and create summaries for tables for publication. 
    4. Reshaping data: Data are not always in the right format for analysis or visualization. Learn how to transform data from wide to long format and back again. 
    5. R Markdown: Combine code, output, and text into readable documents with R Markdown. Learn how to create a basic R markdown document for research. 
    6. Working with Qualtrics data in R: Qualtrics is a popular tool for survey research, but the resulting data often require cleaning before analyzing in R. Learn how to efficiently clean Qualtrics data for use in R, including how to reproducibly remove the multiple headers, save labels, and combine multi-response columns. 

    Linux for Research Computing

    This asynchronous course is a gentle introduction to command line programming using Linux. It is designed for CLA researchers and students who need to use high performance computing resources for their work (for example, to run fMRI analyses, parallel computing, or large scale analyses), but have little to no experience with Linux. 

    This course guides participants through:

    1. Connecting to the CLA compute cluster
    2. Navigating directory and file structure using the Linux command-line terminal
    3. Creating, modifying, and moving files using the Linux command-line terminal
    4. Submitting an interactive and a batch computing job and understanding when it is beneficial to use one or the other