I wholeheartedly agree with the author that we need to seriously consider the ethical implications of all this new technology, especially when the window in which someone can opt out of adopting it keeps shrinking. Personally, I’m not interested in owning a self-driving car (I actually like the experience of driving), but eventually I’m not going to have a choice.
Since the inception of classical statistics around the turn of the 20th century, researchers have used it to analyze data sets. In general, the focus has been on analyzing a sample of the data, then generalizing the findings to the entire population. This was born of necessity: until very recently, the technology didn’t exist to let people analyze entire data sets containing millions of data points.
Because only a sample of the data is analyzed, statisticians spend a great deal of time ensuring that the assumptions allowing the sample results to be generalized are met. At least, they try to do so: in practice, statistical modelling is an exercise in how many assumptions you can violate without compromising your analysis; as a result, the methods that get used are not necessarily the most powerful, but the ones that are most robust to these violations.
Machine learning models, on the other hand, are typically validated by testing their performance against a holdout sample of data points that aren’t used to estimate the model’s parameters. Models that don’t perform well on the holdout sample are discarded. This helps guard against overfitting, because a model will only perform well on data it hasn’t seen if it contains features that have predictive value for the entire data population.
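As a rough illustration of holdout validation, here is a minimal sketch in plain Python. Everything in it is an assumption made for the example: the data are synthetic (a straight line plus noise), and the "memorizer" is a deliberately overfit nearest-neighbour model invented to show what the holdout sample catches. It is not a recipe for any particular library.

```python
import random


def fit_line(points):
    """Least-squares fit of y = intercept + slope * x; returns a predictor."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return lambda x: intercept + slope * x


def fit_memorizer(points):
    """Deliberately overfit model: return the y of the nearest training x."""
    return lambda x: min(points, key=lambda p: abs(p[0] - x))[1]


def mse(predict, points):
    """Mean squared prediction error over a list of (x, y) pairs."""
    return sum((predict(x) - y) ** 2 for x, y in points) / len(points)


def holdout_comparison(seed=0, n_train=200, n_holdout=100):
    """Fit both models on the training split and score them on both splits."""
    rng = random.Random(seed)
    data = []
    for _ in range(n_train + n_holdout):
        x = rng.uniform(0, 10)
        # Synthetic, assumed data-generating process: a line plus noise.
        data.append((x, 1.0 + 2.0 * x + rng.gauss(0, 1)))
    train, holdout = data[:n_train], data[n_train:]
    results = {}
    for name, fit in (("line", fit_line), ("memorizer", fit_memorizer)):
        predict = fit(train)
        results[name] = {
            "train": mse(predict, train),
            "holdout": mse(predict, holdout),
        }
    return results


for name, scores in holdout_comparison().items():
    print(f"{name:10s} train MSE = {scores['train']:.2f}  "
          f"holdout MSE = {scores['holdout']:.2f}")
```

The memorizer reproduces its own training data exactly, so judged on training error alone it would look like the better model. It is only the holdout score that exposes it as having learned the noise.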
To the extent possible, researchers using classical statistics should also use this type of external validation. This is true even when model assumptions seem to be met, as the data set itself represents just one point in time. When the number of data points is small, cross-validation methods, which test a series of models against a very small holdout sample (often one case, as in leave-one-out cross-validation), can be used. Since these types of models are often parsimonious, they might also be validated against similar data sets, with the aim of including only those features that maintain their predictive power across every data set tested.
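For small samples, the one-case holdout can be sketched as below. This is only an illustration under stated assumptions: the twelve data points are synthetic, and the two candidate models (an intercept-only model and a straight line) are stand-ins chosen for the example.

```python
import random


def fit_mean(points):
    """Intercept-only model: always predict the mean of the training y values."""
    mean_y = sum(y for _, y in points) / len(points)
    return lambda x: mean_y


def fit_line(points):
    """Least-squares fit of y = intercept + slope * x; returns a predictor."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return lambda x: intercept + slope * x


def leave_one_out_mse(fit, points):
    """Refit with each case held out in turn and average the squared errors."""
    total = 0.0
    for i, (x, y) in enumerate(points):
        training = points[:i] + points[i + 1:]  # every case except case i
        predict = fit(training)
        total += (predict(x) - y) ** 2
    return total / len(points)


def make_small_sample(seed=1, n=12):
    """Synthetic small sample (an assumption for this sketch): line plus noise."""
    rng = random.Random(seed)
    return [(float(x), 3.0 + 1.5 * x + rng.gauss(0, 1)) for x in range(n)]


sample = make_small_sample()
scores = {
    "mean only": leave_one_out_mse(fit_mean, sample),
    "line": leave_one_out_mse(fit_line, sample),
}
for name, score in scores.items():
    print(f"{name:10s} leave-one-out MSE = {score:.2f}")
```

Every case serves as the holdout exactly once, so no data are permanently set aside, which is what makes the approach workable when there are only a handful of observations.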
In addition, researchers using machine learning should verify that their model assumptions are met. This can be easy to overlook, because the assumptions are often less stringent than in classical statistics. It’s also tempting to focus on the model’s performance against the holdout sample as “proof” that it is correct. This is a potentially serious error, as holdout performance may not reveal systematic deviations from the assumptions that exist in both the training and validation data.
The foundation of classical statistics owes much to Sir Ronald Fisher’s agricultural experiments, conducted in the early 20th century. While his methods continue to be invaluable, it’s important to remember that they were designed for a world where analysis was conducted by hand on a small set of data points. That’s not usually the case today. As a result, it’s critical to draw on the best insights of both classical statistics and contemporary analytics to develop accurate and reliable models for today’s world.
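To make the point concrete, here is a small sketch in which the holdout score alone says nothing about a violated assumption. The setup is assumed for illustration: synthetic data that follow a curve, fitted with a straight line. Because the training and holdout samples share the same curvature, the two error figures should come out close to each other, yet the holdout residuals still show a systematic pattern.

```python
import random


def fit_line(points):
    """Least-squares fit of y = intercept + slope * x; returns a predictor."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return lambda x: intercept + slope * x


def mse(predict, points):
    """Mean squared prediction error over a list of (x, y) pairs."""
    return sum((predict(x) - y) ** 2 for x, y in points) / len(points)


def mean_residual_by_band(predict, points, edges):
    """Average residual (observed minus predicted) in each [low, high) band of x."""
    means = []
    for low, high in zip(edges[:-1], edges[1:]):
        band = [y - predict(x) for x, y in points if low <= x < high]
        means.append(sum(band) / len(band))
    return means


def curved_data_check(seed=2, n_train=200, n_holdout=100):
    """Fit a line to curved synthetic data and inspect the holdout residuals."""
    rng = random.Random(seed)
    data = []
    for _ in range(n_train + n_holdout):
        x = rng.uniform(0, 3)
        # Assumed process: the true relationship is curved, not linear.
        data.append((x, x ** 2 + rng.gauss(0, 0.2)))
    train, holdout = data[:n_train], data[n_train:]
    predict = fit_line(train)
    return {
        "train_mse": mse(predict, train),
        "holdout_mse": mse(predict, holdout),
        "band_means": mean_residual_by_band(predict, holdout, [0, 1, 2, 3]),
    }


check = curved_data_check()
print(f"train MSE = {check['train_mse']:.2f}, holdout MSE = {check['holdout_mse']:.2f}")
print("mean holdout residual for low, middle, high x:",
      [round(m, 2) for m in check["band_means"]])
```

If the linearity assumption held, the average residual in each band would hover around zero. Here the line under-predicts at both ends and over-predicts in the middle, a deviation that comparing training and holdout error would not flag.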
When I was in graduate school, we debated the merits of a centralized data store that policy makers could use to make better decisions; ultimately, we decided the risks to privacy outweighed the benefits.
Data collected by (ethical) businesses is de-identified, typically by assigning each case an arbitrary number. Government data isn’t, although, as the article below points out, it could be. The bigger concern is that, unlike private organizations, the government can detain, arrest, and even execute people. On the one hand, none of these things happen without due process; on the other, power corrupts, and, what is far more worrying, people make mistakes. Are we willing to accept that?
We might be, if it actually does lead to better policy: having worked for the government, I can tell you that we routinely made decisions on what I’ll call sparse information. Several times, I had to request data from another state agency, and each time we had to draft an agreement specifying precisely what my agency could do with it. And there’s no data standardization across agencies, so sometimes, after going through all this, I wasn’t able to merge the two data sets.
Which raises another issue: to make this work, each agency would have to use the same data semantics, file structure, and database application. Even in the ideal case, where everyone can agree on a common data dictionary, each agency’s ability to contribute data will be limited by its own architecture. And, as the second link makes clear, things are usually not ideal.
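To give a sense of what a common data dictionary buys you, here is a minimal sketch. Every name in it is hypothetical: the two agencies, their field names, the county codes, and the rows are all made up for illustration. It only addresses the semantics problem (shared names, codes, and key format), not file structure or database software.

```python
# Hypothetical shared data dictionary: for each agency, how its local field
# names map to the agreed names, and how its local codes map to agreed values.
DATA_DICTIONARY = {
    "agency_a": {
        "fields": {"case_no": "case_id", "cnty": "county"},
        "codes": {"county": {"01": "Adams", "02": "Baker"}},
    },
    "agency_b": {
        "fields": {"CaseID": "case_id", "benefit_amt": "benefit_amount"},
        "codes": {},
    },
}


def harmonize(record, agency):
    """Rename fields and recode values according to the shared dictionary."""
    spec = DATA_DICTIONARY[agency]
    clean = {}
    for local_name, value in record.items():
        if local_name not in spec["fields"]:
            continue  # fields outside the dictionary are not shared
        shared_name = spec["fields"][local_name]
        recode = spec["codes"].get(shared_name, {})
        clean[shared_name] = recode.get(value, value)
    # Store the key in one agreed form so "7" and 7 refer to the same case.
    clean["case_id"] = str(clean["case_id"]).strip()
    return clean


def merge_on_case_id(left, right):
    """Inner join of two lists of harmonized records on case_id."""
    right_by_id = {row["case_id"]: row for row in right}
    return [
        {**row, **right_by_id[row["case_id"]]}
        for row in left
        if row["case_id"] in right_by_id
    ]


# Made-up rows, as each hypothetical agency might store them.
agency_a_rows = [
    {"case_no": "7", "cnty": "01", "notes": "internal"},
    {"case_no": "8", "cnty": "02"},
]
agency_b_rows = [
    {"CaseID": 7, "benefit_amt": 250.0},
    {"CaseID": 9, "benefit_amt": 90.0},
]

merged = merge_on_case_id(
    [harmonize(r, "agency_a") for r in agency_a_rows],
    [harmonize(r, "agency_b") for r in agency_b_rows],
)
print(merged)
```

Without the dictionary, the two sources above cannot be joined at all: the key has a different name in each, and one stores it as text while the other stores it as a number.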
Here is an interesting take on the evolving distribution of work between humans and artificial intelligence. We need to begin dealing with AI as an actual form of intelligence, of a different nature than ours, and with its own strengths and weaknesses. True, machines may not actually think (yet), but for specialized tasks such as medical diagnosis, they are beginning to outperform experts who do. Yet we have no way to judge a machine’s credibility as we would a human expert’s: as the article notes, building one will take fundamental changes in how businesses organize and complete their work.