DISA

Multilingual Video Analysis in Documentation

By Press Room, January 7, 2025

Combating Misinformation on TikTok: Documented’s Investigative Toolkit

Social media platforms like TikTok have created new challenges in combating misinformation, particularly among vulnerable populations such as migrants. Documented, a non-profit news organization focused on immigrant communities in New York City, responded by developing a multi-step method for investigating and exposing misinformation targeted at migrants. This article describes that method for journalists and researchers facing similar problems.

Documented’s investigation began with firsthand accounts from migrants who relied on TikTok for information about their journey to New York City and often encountered misleading or false claims. The organization then worked with community correspondents and experts to identify prevalent misinformation themes, including predatory scams targeting migrants. This groundwork shaped the technical investigation that followed.

A significant hurdle in tackling online misinformation is that the content is ephemeral. Disinformation campaigns often appear and vanish quickly, so evidence has to be preserved fast. Documented addressed this by building a system for identifying and archiving relevant TikTok accounts. The team collaborated with experts and migrants to pinpoint accounts spreading misinformation, then wrote a Python scraper to extract video URLs from archived HTML pages of those accounts. Finally, they used the yt-dlp library to download the videos and their metadata and store them locally.
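The archiving step described above can be sketched roughly as follows. This is an illustration, not Documented's actual scraper: the URL pattern, the function names `extract_video_urls` and `download_videos`, and the output folder are assumptions. The yt-dlp options used (`outtmpl`, `writeinfojson`) are standard ones, and the library is imported only when a download is requested.

```python
import re
from html.parser import HTMLParser

# Assumed shape of a TikTok video link: https://www.tiktok.com/@user/video/<digits>
VIDEO_URL = re.compile(r"https://www\.tiktok\.com/@[\w.-]+/video/\d+")


class VideoLinkParser(HTMLParser):
    """Collects unique TikTok video URLs found in <a href="..."> attributes."""

    def __init__(self):
        super().__init__()
        self.urls = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                match = VIDEO_URL.match(value)
                # Keep the canonical part of the URL and skip duplicates.
                if match and match.group(0) not in self.urls:
                    self.urls.append(match.group(0))


def extract_video_urls(html_text):
    """Return video URLs from one archived account page, in page order."""
    parser = VideoLinkParser()
    parser.feed(html_text)
    return parser.urls


def download_videos(urls, out_dir="videos"):
    """Download each video plus a metadata .info.json file with yt-dlp."""
    import yt_dlp  # third-party; imported lazily so extraction works without it

    options = {
        "outtmpl": f"{out_dir}/%(id)s.%(ext)s",  # name files by video id
        "writeinfojson": True,                    # save metadata next to the video
    }
    with yt_dlp.YoutubeDL(options) as ydl:
        ydl.download(urls)
```

Working from saved HTML rather than the live site means the list of links can be rebuilt later even if an account has been deleted in the meantime.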

Analyzing large volumes of video presents another challenge, because manually reviewing each video is time-consuming and impractical. Documented therefore used the open-source Whisper speech recognition model to transcribe the videos automatically. Transcription accuracy varied across languages, but the output was good enough to give a general understanding of the content and to surface key themes for further investigation. The organization acknowledged the limitations of tools like Whisper and emphasized the importance of contextualizing and verifying anything extracted through automated processes.
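A minimal sketch of the transcription step, assuming the downloaded videos are `.mp4` files in one folder. The helper names, the `base` model size, and the JSON output layout are illustrative choices rather than details from Documented's pipeline. The calls `whisper.load_model` and `model.transcribe` come from the open-source `openai-whisper` package.

```python
import json
from pathlib import Path


def load_whisper_transcriber(model_name="base"):
    """Build a transcribe(path) function backed by a Whisper model."""
    import whisper  # third-party (openai-whisper); imported lazily

    model = whisper.load_model(model_name)

    def transcribe(path):
        result = model.transcribe(str(path))
        # Whisper reports the language it detected alongside the text.
        return {"text": result["text"].strip(), "language": result["language"]}

    return transcribe


def transcribe_folder(video_dir, out_path, transcribe):
    """Transcribe every .mp4 in video_dir and save the records as JSON."""
    records = []
    for video in sorted(Path(video_dir).glob("*.mp4")):
        records.append({"video": video.name, **transcribe(video)})
    Path(out_path).write_text(
        json.dumps(records, ensure_ascii=False, indent=2), encoding="utf-8"
    )
    return records
```

A typical call would be `transcribe_folder("videos", "transcripts.json", load_whisper_transcriber())`. Storing the detected language with each transcript makes it easier to spot-check the languages where accuracy is weakest.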

To refine the analysis, Documented used natural language processing (NLP) and topic modeling. NLP turned the transcribed text into analyzable data, making it possible to count frequently occurring words and phrases. Topic modeling, a form of unsupervised machine learning, clustered related words and surfaced underlying themes within the videos. Together, these techniques revealed recurring topics such as religious misinformation and issues surrounding the CBP One app, which migrants used to schedule appointments at U.S. ports of entry. The identified themes guided further investigation and provided a framework for understanding the broader landscape of misinformation targeting migrants.
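The two steps can be illustrated with a short sketch. The article does not say which topic modeling algorithm Documented used, so the choice of Latent Dirichlet Allocation from scikit-learn is an assumption, as are the tiny stopword list and the function names. The word-count part uses only the standard library.

```python
import re
from collections import Counter

# Deliberately tiny English/Spanish stopword list, for illustration only.
STOPWORDS = {"the", "and", "for", "you", "que", "los", "las", "para", "por", "una"}


def tokenize(text):
    """Lowercase, keep runs of letters, drop stopwords and very short words."""
    words = re.findall(r"[^\W\d_]+", text.lower())
    return [w for w in words if len(w) > 2 and w not in STOPWORDS]


def top_terms(transcripts, n=10):
    """Most frequent terms across all transcripts, as (word, count) pairs."""
    counts = Counter()
    for text in transcripts:
        counts.update(tokenize(text))
    return counts.most_common(n)


def fit_topics(transcripts, n_topics=5, n_words=8):
    """Cluster transcripts into topics; returns the top words for each topic."""
    # Third-party (scikit-learn); imported lazily so top_terms works without it.
    from sklearn.decomposition import LatentDirichletAllocation
    from sklearn.feature_extraction.text import CountVectorizer

    vectorizer = CountVectorizer(tokenizer=tokenize)
    matrix = vectorizer.fit_transform(transcripts)
    lda = LatentDirichletAllocation(n_components=n_topics, random_state=0)
    lda.fit(matrix)
    vocab = vectorizer.get_feature_names_out()
    return [
        [vocab[i] for i in weights.argsort()[::-1][:n_words]]
        for weights in lda.components_
    ]
```

The topic word lists are only a starting point. A person still has to read them, decide what each cluster is about, and check that reading against the videos themselves.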

Documented’s approach highlights the importance of combining macro-level analysis with micro-level examination. While the automated analysis provided a broad overview of the content, the team also carefully reviewed individual videos, particularly those with high viewership, to provide concrete examples and contextualize the larger trends. This combination of quantitative and qualitative analysis allows for a richer and more nuanced understanding of the misinformation ecosystem.
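One way to choose which individual videos to review first is to rank them by view count. The sketch below assumes each video was saved with a yt-dlp `.info.json` metadata file and that the file has a `view_count` field, which yt-dlp records when the platform exposes it. The function name and the default limit are hypothetical.

```python
import json
from pathlib import Path


def most_viewed(metadata_dir, limit=20):
    """Return the most-viewed videos found in yt-dlp .info.json files."""
    rows = []
    for path in Path(metadata_dir).glob("*.info.json"):
        info = json.loads(path.read_text(encoding="utf-8"))
        rows.append(
            {
                "id": info.get("id"),
                # A missing or null view count is treated as zero.
                "views": info.get("view_count") or 0,
                "title": info.get("title", ""),
            }
        )
    rows.sort(key=lambda row: row["views"], reverse=True)
    return rows[:limit]
```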

The technical infrastructure developed by Documented consists of a Python-based code pipeline that includes scripts for extracting video links, downloading videos, transcribing content, and performing topic modeling. By making this pipeline publicly available on GitHub, Documented aims to empower other journalists and researchers to conduct similar investigations. This open-source approach fosters collaboration and accelerates the development of effective strategies to counter online misinformation.

Documented’s investigation serves as a valuable case study in addressing the complexities of misinformation on platforms like TikTok. Their multi-faceted approach, combining technical innovation, community engagement, and rigorous analysis, offers a roadmap for tackling this growing challenge. By sharing their methodology and tools, Documented contributes to a broader effort to combat misinformation and empower vulnerable communities with accurate information.

Each part of the method addresses a specific problem. Archiving content preserves evidence that would otherwise vanish, which supports accountability and later analysis. Automated transcription, while imperfect, is a pragmatic way to manage large volumes of video. NLP and topic modeling expose thematic patterns in the data, and close review of individual videos adds concrete examples that show the real-world impact of misinformation. Publishing the code and methodology lets other organizations adapt and refine these techniques.

Documented’s work highlights the evolving nature of journalistic investigations in the digital age. It underscores the need for adaptable methodologies, technical expertise, and a commitment to community engagement. By sharing their learnings and tools, Documented contributes significantly to the fight against misinformation and provides a valuable resource for journalists and researchers worldwide. Their work serves as a model for innovative, impactful reporting in the face of the complex challenges posed by online platforms and the spread of misinformation.

Their approach is not just about identifying and exposing misinformation, but also about understanding the context in which it spreads and the impact it has on vulnerable communities. By centering the experiences of migrants and working closely with community correspondents, Documented ensures their investigation is grounded in the realities faced by those most affected by misinformation. This community-centric approach is crucial for developing effective counter-strategies and building trust.

The organization’s recognition of the limitations of AI tools is also commendable. While embracing the potential of automated processes, they acknowledge the inherent biases and inaccuracies that can arise. Their emphasis on verifying information and contextualizing findings highlights a responsible approach to using AI in journalistic investigations.

Finally, Documented’s commitment to open-source principles contributes significantly to the fight against misinformation. By sharing their tools and methodology, they empower other organizations and individuals to conduct similar investigations, fostering a collaborative approach to tackling this global challenge. Their work serves as a powerful example of how innovative reporting, combined with technical expertise and community engagement, can make a tangible difference in combating misinformation and protecting vulnerable communities.
