Text-mining and Analysis in Digital Scholarship Research: Learning Resources

Library Online Tutorials

DS Online Tutorials: Learning R Programming and Data Analysis with Fun

The Digital Scholarship Team collaborated with Dr. Jacqueline Wong from Decision Sciences and Management to produce a series of online tutorials on Learning R Programming for Data Analytics to facilitate teaching and learning. The project is funded by the Courseware Development Grant Scheme (2019-22) of CUHK.

The online tutorial has 10 short videos (Playlist):

  1. Introduction of Data Analysis
  2. Start to Use R
  3. Basic Commands I & II
  4. Data Visualization I (Scatterplot)
  5. Data Visualization II (Line Chart)
  6. Data Visualization III (Bar Chart)
  7. Data Visualization IV (Pie Chart)
  8. Regression
  9. Further Learning Resources

DS Online Tutorials: R Starter Pack II

This is the second series on learning R programming, R Starter Pack II. The project is a collaboration with Dr. Jacqueline Wong from Decision Science and Managerial Economics and was funded by the Courseware Development Grant Scheme (2019-22) of CUHK.

The online tutorial has 10 short videos (Playlist):

  1. Introduction of Data Cleaning
  2. Introduction to tidyr & dplyr
  3. Handling Missing Values
  4. Manipulate Cases
  5. Grouping and Summarizing
  6. Date and Time
  7. Hands On Project I
  8. Hands On Project II
  9. Hands On Project III
  10. Database Suggestions

Learn Data Visualisation in Videos

Python Learning Material - by PyOhio (2018.07.30)

Walk through an example in Jupyter Notebook that goes through all of the steps of a text analysis project, using several NLP libraries in Python including NLTK, TextBlob, spaCy and gensim along with the standard machine learning libraries including pandas and scikit-learn.


KNIME Learning Material - by KNIMETV (2020.06.26)

Walk through a demo of some common text analytics techniques using KNIME Analytics Platform.


R Learning Material - by Data Science Dojo (2017.06.06)

This data science training provides introductory coverage of the following tools and techniques:

  - Tokenization, stemming, and n-grams
  - The bag-of-words and vector space models
  - Feature engineering for textual data (e.g. cosine similarity between documents)
  - Feature extraction using singular value decomposition (SVD)
  - Training classification models using textual data
  - Evaluating accuracy of the trained classification models
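
For readers who want a feel for a few of these techniques before watching, here is a minimal sketch of tokenization, n-grams, the bag-of-words model and cosine similarity. It is not taken from the video: it is written in Python with only the standard library rather than in R, the function names are hypothetical, the three sample sentences are invented for illustration, and stemming, SVD and classification are left out.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def ngrams(tokens, n):
    """Return every run of n consecutive tokens as a tuple."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def bag_of_words(tokens):
    """Count how often each token occurs, ignoring word order."""
    return Counter(tokens)


def cosine_similarity(bag_a, bag_b):
    """Cosine of the angle between two term-count vectors."""
    # A Counter returns 0 for terms it has not seen, so missing terms add nothing.
    dot = sum(count * bag_b[term] for term, count in bag_a.items())
    norm_a = math.sqrt(sum(c * c for c in bag_a.values()))
    norm_b = math.sqrt(sum(c * c for c in bag_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Invented sample documents (assumption, for illustration only).
docs = [
    "Text mining finds patterns in text.",
    "Mining text reveals patterns.",
    "R draws a bar chart.",
]
bags = [bag_of_words(tokenize(d)) for d in docs]

print(ngrams(tokenize(docs[0]), 2))          # bigrams of the first document
print(cosine_similarity(bags[0], bags[1]))   # documents sharing several words
print(cosine_similarity(bags[0], bags[2]))   # documents sharing no words
```

The first two sentences share most of their words, so their similarity is well above zero, while the third shares none with the first and scores 0.0. A real project would usually rely on a library such as those named in the tutorials above rather than hand-written helpers.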

Online Tutorials & Guides

Books in CUHK Library

General Understanding of Text Mining and Language Processing

Text Mining and Language Processing in Python