Enter a news article's URL to assess its credibility:
Our goal was to build a model to discern the credibility of an article based solely on its textual content.
Collect news articles from a set of credible and non-credible websites. Get training labels from OpenSources, a professionally curated database.
Sample from our corpus so that, for each day of data collection, the training set contains an equal number of unique articles from credible and non-credible sources.
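The sampling step above could be sketched roughly as follows. This is only an illustration under stated assumptions, not the project's actual code: the record fields (`url`, `day`, `label`), the label strings, and the function name `balanced_daily_sample` are all hypothetical, and "unique" is assumed here to mean "first occurrence of a URL".

```python
import random
from collections import defaultdict


def balanced_daily_sample(articles, seed=0):
    """Pick equal numbers of credible and non-credible articles per day.

    `articles` is a list of dicts with hypothetical keys:
    "url", "day", and "label" ("credible" or "non-credible").
    """
    rng = random.Random(seed)
    seen_urls = set()
    by_day = defaultdict(lambda: {"credible": [], "non-credible": []})

    # Keep only the first occurrence of each URL (assumed notion of "unique").
    for art in articles:
        if art["url"] in seen_urls:
            continue
        seen_urls.add(art["url"])
        by_day[art["day"]][art["label"]].append(art)

    sample = []
    for day in sorted(by_day):
        credible = by_day[day]["credible"]
        non_credible = by_day[day]["non-credible"]
        # Downsample the larger class so both classes match for this day.
        n = min(len(credible), len(non_credible))
        sample.extend(rng.sample(credible, n))
        sample.extend(rng.sample(non_credible, n))
    return sample
```

Downsampling the larger class within each day is one simple way to keep the classes balanced over time; a day that has articles from only one class contributes nothing to the sample.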
Build an ensemble classifier that considers the predictions of two separate models:
1. "Content-only" model (Multinomial Naive Bayes)
2. "Context-only" model (Adaptive Boosting)
Each classifier is retrained daily and evaluated with cross-validation to obtain updated accuracy scores. These scores are used to update the weights in the final ensemble classifier.
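One way the accuracy-based weighting might work is sketched below. The exact weighting scheme is not stated here, so this sketch assumes that each model outputs a probability that an article is credible and that weights are simply proportional to each model's latest cross-validation accuracy. The function names and the model keys `content` and `context` are hypothetical.

```python
def ensemble_weights(cv_accuracies):
    """Turn per-model cross-validation accuracies into weights that sum to 1.

    Assumption: weights are proportional to accuracy.
    """
    total = sum(cv_accuracies.values())
    return {name: acc / total for name, acc in cv_accuracies.items()}


def ensemble_probability(model_probs, weights):
    """Weighted average of each model's predicted probability of 'credible'."""
    return sum(weights[name] * prob for name, prob in model_probs.items())


# Example: refresh the weights after a daily retraining run, then score one article.
weights = ensemble_weights({"content": 0.9, "context": 0.6})
score = ensemble_probability({"content": 0.8, "context": 0.3}, weights)
is_credible = score >= 0.5  # hypothetical decision threshold
```

With this scheme, a model whose accuracy slips after a daily retraining automatically loses influence on the final prediction without any manual tuning.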
A more detailed discussion of how we handled specific challenges throughout the course of this project can be found on Medium.
Meet the team behind the project. As graduates of UC Berkeley's Masters in Information & Data Science program, we are keen on developing machine learning solutions to real-world problems.