Transmitting Science has organized a new course entitled “Web Scraping and Text Analysis in R”.
Live online sessions on March 21st and 28th and April 4th and 11th, 2025, from 15:00 to 18:30 (Madrid time zone).
Course overview
Researchers often require access to large datasets from various sources, such as biodiversity databases, environmental monitoring websites, or online repositories. Manually collecting such data is time-intensive and inefficient. Web scraping can automate the extraction of scientific data from public databases, scientific publications, and organisational websites. For instance, one can scrape weather data or geological survey results to analyse trends or share findings with collaborators.
Establishing networks and promoting lifelong scientific learning often require analysing large volumes of text, such as conference abstracts, published papers, or grant announcements, to identify trends, common research interests, or potential collaborators. Text mining and Natural Language Processing (NLP) techniques can process large text datasets to identify key topics, and sentiment analysis can assess public or academic opinion on specific scientific issues.
Key Highlights
Web Scraping Techniques: Learn how to extract data from websites using R packages designed for processing HTML.
Natural Language Processing (NLP): Gain an introduction to NLP concepts, including text mining, tokenization, and sentiment analysis.
Hands-On Learning: Engage in practical exercises to scrape review data from online platforms and perform advanced text manipulations.
This course includes a range of activities, such as web-scraping demos, live-coding sessions, interactive quizzes, and practical exercises to be completed individually or in groups. Active participation and contribution are encouraged.
Places are limited to 15 participants.
For more information and registration, please visit the course webpage: https://www.transmittingscience.com/courses/statistics-and-bioinformatics/web-scraping-and-text-analysis-in-r/ or write to [email protected]