Text-mining and Analysis in Digital Scholarship Research: Learning Resources

Library Online Tutorials

DS Online Tutorials: Learning R Programming and Data Analysis with Fun

The Digital Scholarship Team has collaborated with Dr. Jacqueline Wong from Decision Sciences and Managerial Economics to produce a series of online tutorials on Learning R Programming for Data Analytics to facilitate teaching and learning. The project is funded by the Courseware Development Grant Scheme (2019-22) of CUHK.

The online tutorial has 10 short videos (Playlist):

  1. Introduction of Data Analysis
  2. Start to Use R
  3. Basic Commands I & II
  4. Data Visualization I (Scatterplot)
  5. Data Visualization II (Line Chart)
  6. Data Visualization III (Bar Chart)
  7. Data Visualization IV (Pie Chart)
  8. Regression
  9. Further Learning Resources

DS Online Tutorials: R Starter Pack II

This is the second series on learning R programming: R Starter Pack II. The project is a collaboration with Dr. Jacqueline Wong from Decision Sciences and Managerial Economics and was funded by the Courseware Development Grant Scheme (2019-22) of CUHK.

The online tutorial has 10 short videos (Playlist):

  1. Introduction of Data Cleaning
  2. Introduction to tidyr & dplyr
  3. Handling Missing Values
  4. Manipulate Cases
  5. Grouping and Summarizing
  6. Date and Time
  7. Hands On Project I
  8. Hands On Project II
  9. Hands On Project III
  10. Database Suggestions

Learn Data Visualisation in Videos

Python Learning Material - by PyOhio (2018.07.30)

Walk through an example in Jupyter Notebook that covers every step of a text analysis project, using several Python NLP libraries (NLTK, TextBlob, spaCy and gensim) along with standard data and machine learning libraries, including pandas and scikit-learn.

KNIME Learning Material - by KNIMETV (2020.06.26)

Walk through a demo of some common text analytics techniques using the KNIME Analytics Platform.

R Learning Material - by Data Science Dojo (2017.06.06)

This data science training provides introductory coverage of the following tools and techniques:

  – Tokenization, stemming, and n-grams
  – The bag-of-words and vector space models
  – Feature engineering for textual data (e.g. cosine similarity between documents)
  – Feature extraction using singular value decomposition (SVD)
  – Training classification models using textual data
  – Evaluating accuracy of the trained classification models
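
For readers new to these terms, the first three techniques can be illustrated with a minimal sketch. It is written in Python using only the standard library and is not taken from the training itself, which uses R; the function names and the two sample sentences are hypothetical, chosen only for illustration. It tokenizes text, builds n-grams and bag-of-words counts, and compares two documents by cosine similarity.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def ngrams(tokens, n):
    """Return the n-grams (as tuples) over a token sequence."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def bag_of_words(text):
    """Map each token to its count, ignoring word order."""
    return Counter(tokenize(text))


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty document has no direction to compare
    return dot / (norm_a * norm_b)


# Hypothetical sample documents
doc1 = "The cat sat on the mat"
doc2 = "The cat lay on the rug"

print(ngrams(tokenize(doc1), 2)[:2])
print(round(cosine_similarity(bag_of_words(doc1), bag_of_words(doc2)), 3))
```

The two sample sentences share "the", "cat" and "on", so their similarity is well above zero but below 1. Stemming, SVD and classifier training are left out here because they are usually done with a dedicated library rather than by hand.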

Online Tutorials & Guides

Books in CUHK Library

General Understanding of Text Mining and Language Processing

Text Mining and Language Processing in Python