
Instagram & TikTok Scraper v1.0.0

First stable release of the Instagram & TikTok Scraper for academic research.

Features
- Instagram scraper: Playwright-based, with API interception and cursor pagination
- TikTok scraper: yt-dlp with browser impersonation and carousel reconstruction
- Language detection: automatic via lingua-py (95+ languages)
- CSV export: 23 standardized variables per post
- Screenshot capture: embedded post format for both platforms
- Carousel/slideshow reconstruction: MP4 generation via ffmpeg (both platforms)
- Configurable study periods and rate limiting

Requirements
- Python 3.10+
- ffmpeg
- Playwright (Chromium)

Documentation
- Full usage guide in README
- Paper included (paper.md, paper.bib)
- 46 unit tests

License
GPL-3.0
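To illustrate what "configurable study periods and rate limiting" can mean in practice, here is a minimal sketch using only the Python standard library. It is not the scraper's actual code: the names `StudyPeriod`, `RateLimiter`, and `filter_posts`, and the `posted_at` field, are hypothetical and chosen for this example only. Consult the README for the real configuration options.

```python
import time
from dataclasses import dataclass
from datetime import date, datetime, timezone


@dataclass(frozen=True)
class StudyPeriod:
    """Inclusive date window for posts to keep (hypothetical name)."""
    start: date
    end: date

    def contains(self, posted_at: datetime) -> bool:
        # Compare on the UTC calendar date; posted_at is assumed timezone-aware.
        return self.start <= posted_at.astimezone(timezone.utc).date() <= self.end


class RateLimiter:
    """Enforces a minimum delay between consecutive requests (hypothetical name)."""

    def __init__(self, min_interval_s: float) -> None:
        self.min_interval_s = min_interval_s
        self._last_call = None  # monotonic timestamp of the previous call

    def wait(self) -> float:
        """Sleep just long enough to respect the interval; return seconds slept."""
        slept = 0.0
        if self._last_call is not None:
            remaining = self.min_interval_s - (time.monotonic() - self._last_call)
            if remaining > 0:
                time.sleep(remaining)
                slept = remaining
        self._last_call = time.monotonic()
        return slept


def filter_posts(posts, period):
    """Keep only posts whose 'posted_at' timestamp falls inside the study period."""
    return [p for p in posts if period.contains(p["posted_at"])]
```

In a collection loop, one would call `limiter.wait()` before each request and pass the collected records through `filter_posts` before export. Using `time.monotonic()` rather than wall-clock time keeps the delay correct even if the system clock is adjusted during a long run.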
If you use this software, please cite it using these metadata. Proper citation acknowledges the work and supports academic integrity.

Beyond academic rules, citing software is a matter of ethical research conduct. Developing, maintaining, and providing open access to technical tools requires a substantial investment of time and expertise. Using these resources without attribution, or presenting them as part of your own technical pipeline, is bad practice that undermines the "Open Science" ecosystem.

Failing to credit the author is not just an oversight; it is a lack of respect for the intellectual labor that makes your data collection possible. Let's build a culture of transparency where we value the tools that drive our discoveries.
tiktok, social-media, instagram, yt-dlp, scraper, academic-research, playwright, data-collection
