Big Data Poll

Here’s the methodology behind how we do it. For media clients, we also provide all the required documents and information needed to meet the standards of the American Association for Public Opinion Research (AAPOR) Transparency Initiative.

Clients, journalists, researchers, and consumers should always refer to the AAPOR TI Checklist for detailed disclosures pertaining to a particular survey.

Panel Respondents

Interviews are conducted via the Big Data Poll National Internet Polling Panel and partners, which in total boast a reach of nearly 30 million in the U.S. Interviews are NOT conducted on most national holidays and, depending on the survey method (random sample vs. opt-in), samples can include various percentages of repeat interviews from panelists, which we have demonstrated provides a more accurate gauge of shifting public opinion.

For BDP panels, respondents are recruited by numerous methods, including mailers, email, social media- or website-based advertisements, etc. Most panelists are recruited from blocks of records pulled from voter files. We do not provide financial incentives to respondents.

Panelists also have the option to sign up for the panel, as with other Internet survey panels such as SurveyMonkey and YouGov. The difference, which accounts for the disparity in results, is our initial and likely voter screens.

During screening or initial interviews, respondents are asked to give their names, contact information (i.e., email and/or phone), and the city and zip code where they are registered or plan to register to vote. This allows us to attempt to verify registration status during deduping, to re-interview panelists, to obtain regional data, etc.
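
For illustration only, the sketch below shows what a minimal dedup pass over screening records might look like in Python; the field names (email, name, zip) and the matching rule are hypothetical placeholders, not our production logic.

    # Hypothetical dedup sketch: treat two screening records as the same
    # panelist if the email matches, or if both name and zip code match.
    def dedupe(records):
        seen_emails = set()
        seen_name_zip = set()
        unique = []
        for r in records:
            email = (r.get("email") or "").strip().lower()
            name_zip = (r.get("name", "").strip().lower(), r.get("zip", "").strip())
            if email and email in seen_emails:
                continue
            if all(name_zip) and name_zip in seen_name_zip:
                continue
            if email:
                seen_emails.add(email)
            if all(name_zip):
                seen_name_zip.add(name_zip)
            unique.append(r)
        return unique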

We’ll attempt to contact a respondent for a repeat interview up to 10 times before removing them from the panel. The final two attempts are traditionally made via a live caller or IVR software.

Samples

Big Data Poll conducts several different types of surveys, from phone-based random samples to opt-in online survey panels, depending on clients’ needs, goals and objectives.

Random

We select a random sample of panelists to take part in our online surveys. For phone surveys, we pull a targeted random sample from our voter files, which contain roughly 224 million records and are constantly updated, deduped and cleaned.

Automated phone surveys and live interview surveys are conducted Monday through Friday between 5:00 PM and 8:30 PM local time, state law permitting. On Saturday, interviews are conducted from 11:00 AM to 8:00 PM local time. On Sunday, survey interviews are conducted from 1:00 PM to 8:00 PM local time.

For online surveys, rather than randomly pulling and dialing a list of phone numbers from voter files, we instead randomly draw from a diverse panel of respondents. We ask respondents, at a minimum, about the demographics detailed below, whether they are registered to vote, what state they live in, etc., just as phone-based pollsters have done for years.

In these samples, we calculate a traditional margin of error (MoE).

Internet Opt-In Panel

In this sample, all responses are treated as “opt-in Internet panel” even though a percentage of respondents were specifically targeted based on registration status (more on that below under Population). They are still ultimately considered opt-in and we do NOT treat them as a random sample.

In these samples, we use a bootstrap method with a standard 95% confidence interval (CI).

Population

For political & election surveys, top line results are of likely voters, or at least our best estimate of registered voters we view to be most likely to vote based on past voting history, enthusiasm and registration status.

We do not include in the results respondents who report during initial interviews that they are not registered, but we also don’t immediately remove them as potential panelists. They may register in the future and we view them as worthy of a follow-up; in the past, we’ve found follow-up greatly reduces the probability of unintentionally excluding new voters of various ages.

Weighting

Big Data Poll surveys are weighted for demographics such as age, gender, race, income, education and region, based on the Current Population Survey conducted by the U.S. Census Bureau and the Bureau of Labor Statistics. For political surveys, registration targets are also obtained from the most recent Current Population Survey.
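
A common way to hit such demographic targets is rim weighting, also known as raking or iterative proportional fitting. The Python sketch below illustrates the general idea; the categories and target shares are placeholders, not actual CPS figures, and it is not a description of our exact weighting routine.

    # Raking sketch (illustrative only): adjust respondent weights until each
    # demographic margin matches its target share.
    import pandas as pd

    def rake(df, targets, weight_col="weight", max_iter=50, tol=1e-6):
        df = df.copy()
        df[weight_col] = 1.0
        for _ in range(max_iter):
            max_shift = 0.0
            for col, shares in targets.items():
                current = df.groupby(col)[weight_col].sum() / df[weight_col].sum()
                for cat, share in shares.items():
                    if current.get(cat, 0) > 0:
                        factor = share / current[cat]
                        df.loc[df[col] == cat, weight_col] *= factor
                        max_shift = max(max_shift, abs(factor - 1))
            if max_shift < tol:
                break
        return df

    # Placeholder targets for illustration, not real CPS shares.
    sample = pd.DataFrame({"gender": ["M", "F", "F", "M", "F"],
                           "age": ["18-44", "45+", "45+", "18-44", "18-44"]})
    weighted = rake(sample, {"gender": {"M": 0.48, "F": 0.52},
                             "age": {"18-44": 0.45, "45+": 0.55}})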

We use a proprietary likely voter model based on responses to screening questions relating to prior voting history, enthusiasm, registration status, etc. Big Data Poll stresses casting a large, wide net at the start of a project; continuous tracking then enables us to identify the voters who are truly most likely to vote.
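
The model itself is proprietary, but a screen-based cutoff can be sketched in general terms. In the hypothetical example below, the questions, point values and cutoff are invented for illustration and do not reflect our actual scoring.

    # Hypothetical likely-voter screen (illustrative only).
    def likely_voter_score(resp):
        score = 0
        if resp.get("registered"):
            score += 3
        score += min(resp.get("past_elections_voted", 0), 3)  # cap vote history
        if resp.get("enthusiasm_1_to_5", 0) >= 4:             # high enthusiasm
            score += 1
        return score

    def is_likely_voter(resp, cutoff=5):
        return likely_voter_score(resp) >= cutoff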

The poll does oversample, primarily as a result of the entire sample being registered voters and the use of the likely voter model. Because of how rigorously we screen, the disparity is not typically significant.

There has long been a debate among pollsters about whether to weight for party identification. Put simply, in 2016 we let the electorate tell us what it will look like, while other pollsters decided beforehand who they believed would vote on Election Day and adjusted accordingly.

Our philosophy is simple: That’s backward. If a pollster has a quality sample, then the electorate will speak to them if they are listening. They shouldn’t ignore what respondents are trying to tell them and they shouldn’t prejudge the electorate and allow their own biases to taint their methodology.

Margin of Error (MoE)

For random sample surveys, we calculate a standard margin of error (MoE) using the total known or estimated population size, sample size and confidence level (%).
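
As a reference point, the sketch below computes the standard formula with a finite population correction, assuming the worst-case proportion of 50%; the sample and population sizes are illustrative.

    # Standard MoE for a simple random sample, with finite population correction:
    # MoE = z * sqrt(p * (1 - p) / n) * sqrt((N - n) / (N - 1))
    import math

    def margin_of_error(n, population, z=1.96, p=0.5):
        se = math.sqrt(p * (1 - p) / n)
        fpc = math.sqrt((population - n) / (population - 1))
        return z * se * fpc

    # e.g. 1,000 completed interviews drawn from a very large voter file
    print(round(margin_of_error(1000, 224_000_000) * 100, 2))  # about 3.1 points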

Bootstrap Confidence Interval

No pollster can accurately estimate a traditional margin of error (MoE) if the sample is not a probability sample drawn by random selection. Though we don’t know exactly who will and will not respond to the panel on a given day of a survey, responses are still treated as opt-in.

Instead, we use a bootstrap method with a standard 95% confidence interval. Admittedly, this can be a bit difficult with more than one or two choices, but we have had success in accounting for this in the past.

To reduce human error, we use StatKey to calculate the 95% confidence interval by generating 5,000 bootstrap samples from the weighted results.
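
StatKey is a point-and-click web tool; the Python sketch below illustrates the same percentile-bootstrap idea, resampling respondents 5,000 times and reading off the 2.5th and 97.5th percentiles. The data and weights are placeholders, not survey results.

    # Percentile bootstrap CI for a weighted proportion (illustrative only).
    import numpy as np

    def bootstrap_ci(responses, weights, n_boot=5000, alpha=0.05, seed=0):
        rng = np.random.default_rng(seed)
        responses = np.asarray(responses, dtype=float)
        weights = np.asarray(weights, dtype=float)
        n = len(responses)
        estimates = np.empty(n_boot)
        for b in range(n_boot):
            idx = rng.integers(0, n, size=n)  # resample respondents with replacement
            estimates[b] = np.average(responses[idx], weights=weights[idx])
        lo, hi = np.percentile(estimates, [100 * alpha / 2, 100 * (1 - alpha / 2)])
        return lo, hi

    # e.g. 1 = supports the candidate, 0 = does not; weights from the weighting step
    support = np.random.default_rng(1).integers(0, 2, size=800)
    print(bootstrap_ci(support, np.ones(800)))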