Getting Started

Utilize R for your mixed model analysis

Data in social science research is often clustered. Hierarchical linear modeling (HLM) lets you model that clustering explicitly, which helps keep Type I error rates from inflating. This tutorial uses R to demonstrate the basic steps of HLM in social science research.

Image credit: Pixabay

Before beginning the analysis, let’s briefly talk about what HLM is.

HLM (also known as multilevel modeling) analyzes data that are clustered in an organized pattern, such as universities within states, non-white males within tech companies, and clinics within hospitals. HLM is an extension of ordinary least squares (OLS) regression that requires all of its assumptions to be met (check out my tutorial for OLS assumption and…

The first important step before analyzing your data is getting to know it. This tutorial uses R and focuses on basic data screening principles, such as checking for outliers and testing general linear model assumptions, before beginning data analysis.


Case study: The data for this tutorial are fictitious and concern psychosocial traits and abusive relationships. Let’s say I am interested in examining the roles of adverse childhood experience (ACE) and borderline personality traits (BPT) (e.g., extreme fear of abandonment, emotional dysregulation) in predicting one’s likelihood of experiencing an abusive intimate relationship in adulthood. I will also consider the roles of demographic variables, such as age, race, gender, educational level, intelligence, and income, because those factors may affect one’s intimate relationships. According to the scenario, the independent variables are two continuous variables (childhood trauma exposure and BPT), and the outcomes…

Image credit: Pixabay

Have you ever been interested in what a science organization has recently discussed on Twitter? Have you ever been curious about how Twitter users have recently talked about a given topic? If you want to answer those questions, this tutorial is for you.

Before answering those questions, you will need the Python libraries and commands associated with data scraping and the Twitter Application Programming Interface (API).

Pandas is used for data analysis and manipulation in Python, especially for numeric and time-series data.

Tweepy is a Python library built on the Twitter Application Programming Interface (API) that allows users to compose tweets, access their followers’ data, examine a large number of tweets on a given topic, and read Twitter users’ profiles.

The time module is part of the Python standard library and represents time in code as strings, numbers, and objects.
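To make the three representations of time concrete, here is a minimal sketch that uses only the standard-library time module. The variable names are my own, chosen for illustration, and are not taken from the tutorial.

```python
import time

# Numeric: seconds since the Unix epoch, stored as a float
now_seconds = time.time()

# Object: a struct_time with named fields such as tm_year and tm_hour
now_struct = time.localtime(now_seconds)

# String: a formatted, human-readable timestamp
now_text = time.strftime("%Y-%m-%d %H:%M:%S", now_struct)

print(type(now_seconds).__name__, type(now_struct).__name__, now_text)
```

The numeric form is convenient for measuring elapsed time or pausing between requests, while the string form is easier to read in logs and output files.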

Import the required…

What are word clouds? A word cloud is a visualization technique that presents a list of online words related to a given topic or hashtag. Below is what a word cloud looks like if, say, someone wants to visualize a list of words associated with the hashtag “data science” on Twitter:

Word clouds can be easily generated using free online generators. However, to use them, you usually need a list of words and their frequencies (e.g., how often each word has been mentioned on Twitter). Manually counting the words associated with a hashtag on Twitter is laborious and impractical. This tutorial shows how to create a word cloud by using Python to generate a list of words associated with your hashtag of interest on Twitter.
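As a rough illustration of the word-and-frequency list described above, the sketch below counts words with the standard-library Counter. The sample tweets, the short stop-word list, and the word_frequencies helper are all hypothetical, made up for this example; the tutorial itself collects real tweets through the Twitter API.

```python
import re
from collections import Counter

# Hypothetical tweets, made up for illustration only
sample_tweets = [
    "Learning #DataScience with Python and pandas",
    "Python makes data science fun #DataScience",
    "Data cleaning is half of data science",
]

# A tiny, hypothetical stop-word list; real projects use a much longer one
STOP_WORDS = {"is", "of", "and", "with", "the", "a"}


def word_frequencies(texts):
    """Count lowercase words across texts, skipping hashtags and stop words."""
    counts = Counter()
    for text in texts:
        text = re.sub(r"#\w+", " ", text.lower())  # drop the hashtags themselves
        words = re.findall(r"[a-z']+", text)
        counts.update(w for w in words if w not in STOP_WORDS)
    return counts


frequencies = word_frequencies(sample_tweets)
print(frequencies.most_common(3))
```

The resulting word and count pairs are the kind of input that a word cloud generator typically expects.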

What you need before getting started:

Tweepy is based on the Twitter Application Programming Interface (API), allowing users…

Kay Chansiri

Communication psychology researcher, data scientist.
