DSC Weekly Digest 17 August 2021

Author: Kurt A Cagle

Asking the Right Questions


Announcements
  • The secret to successful voice technology is inclusiveness. The more people your model can understand, the more likely you are to acquire and retain customers. Test how well your speech recognition understands non-native English speakers with this free 9-hour dataset, valued at $1350, from DefinedCrowd. Get your free dataset here

Asking the Right Questions

As data systems become more complex and far-reaching, so too does the way that we build applications. Enterprise data no longer means just the databases that a company owns. Increasingly it refers to broad models in which data is shared among multiple departments, is defined by subject matter experts, and is referenced not only by software programs but also by complex machine learning models.

The day when a software developer could arbitrarily create their own model to do one task very specifically seems to be slipping away in favor of standardized models that then need to be transformed into a final form before use. Extract, transform, load (ETL) has now given way to extract, load, transform (ELT). Best practices have shifted as well over the last couple of decades, toward the idea that you want to move core data around as little as possible and rely instead upon increasingly sophisticated queries and transformation pipelines.
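The ELT pattern described above can be sketched in a few lines. This is only an illustration under stated assumptions: the table name raw_orders, its columns, and the sample rows are hypothetical, and an in-memory SQLite database stands in for whatever store an organization actually uses.

```python
import sqlite3


def run_elt(rows):
    """Load raw rows untouched, then transform them with a query in the database."""
    conn = sqlite3.connect(":memory:")
    # Load: store the records exactly as extracted (hypothetical schema).
    conn.execute("CREATE TABLE raw_orders (customer TEXT, amount_cents INTEGER)")
    conn.executemany("INSERT INTO raw_orders VALUES (?, ?)", rows)
    # Transform: derive the final form at query time; the raw table stays as loaded.
    cursor = conn.execute(
        "SELECT customer, SUM(amount_cents) / 100.0 AS total "
        "FROM raw_orders GROUP BY customer ORDER BY customer"
    )
    result = cursor.fetchall()
    conn.close()
    return result


# Extract: hypothetical records pulled from a source system.
extracted = [("ada", 1250), ("bob", 300), ("ada", 750)]
totals = run_elt(extracted)
print(totals)
```

The point of the sketch is the ordering: the data lands first in its raw shape, and the transformation lives in the query rather than in a step that reshapes the data before it is stored.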

At the same time, the notion is growing that the database, in whatever incarnation it takes, is always somewhat local to the application domain. The edge is gaining in intelligence and memory. Indeed, most databases are moving towards in-memory stores, and caching is evolving right along with them.

The future increasingly is about the query. For areas like machine learning, the query ultimately comes down to making models so that they are not only explainable, but tunable as well. The query response is becoming less and less about a single answer, and more about creating whole simulations.

At the same time, the hottest databases are increasingly graph databases that allow for inferencing, the surfacing of knowledge through the subtle interplay of known facts. Bayesian analysis (in various forms and flavors) has become a powerful tool for predicting the most likely scenarios, with queries here having to straddle the line between utility and meaningfulness. What happens when you combine the two? I expect this will be one of the hottest areas of development in the coming years.
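To make "inferencing" concrete, here is a minimal sketch of how new facts can surface from known ones. It assumes facts are stored as (subject, predicate, object) triples and applies a single transitivity rule; the function name infer_transitive and the sample facts are hypothetical, and real graph databases use far richer rule and query languages.

```python
def infer_transitive(facts, predicate):
    """Apply (a p b) and (b p c) implies (a p c) until no new fact appears."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        # Snapshot the current edges for this predicate before adding to the set.
        edges = [(s, o) for (s, p, o) in known if p == predicate]
        for a, b in edges:
            for c, d in edges:
                if b == c and (a, predicate, d) not in known:
                    known.add((a, predicate, d))
                    changed = True
    # Return only the facts that were not stated explicitly.
    return known - set(facts)


facts = {
    ("Seattle", "located_in", "Washington"),
    ("Washington", "located_in", "US"),
}
inferred = infer_transitive(facts, "located_in")
print(inferred)
```

Neither input fact says that Seattle is in the US, yet the rule surfaces it from the interplay of the two that are known.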

SQL won’t be going away – the tabular data paradigm is still one of the easiest ways to aggregate data – but the world is more than just tables. A machine learning model, at the end of the day, is simply an index, albeit one where the keys are often complex objects, and the results are as well. A knowledge graph takes advantage of robust interconnections between the various things in the world and is able to harness that complexity, rather than get bogged down by it.

It is this that makes data science so interesting. For so long, we’ve been focused primarily on getting the right answers. Yet in the future, it’s likely that the real value of data science will lie in learning how to ask the right questions.

In medias res,

Kurt Cagle
Community Editor,
Data Science Central

To subscribe to the DSC Newsletter, go to Data Science Central and become a member today. It’s free! 


Data Science Central Editorial Calendar

DSC is looking for editorial content specifically in these areas for July, with these topics having higher priority than other incoming articles.

  • MLOps and DataOps
  • Machine Learning and IoT
  • Data Modeling and Graphs
  • AI-Enabled Hardware (GPUs and similar tools)
  • JavaScript and AI
  • GANs and Simulations
  • ML in Weather Forecasting
  • UI, UX and AI
  • Jupyter Notebooks
  • No-Code Development
  • Metaverse

DSC Featured Articles



Picture of the Week
Data Literacy Skills

 



This email, and all related content, is published by Data Science Central, a division of TechTarget, Inc.

275 Grove Street, Newton, Massachusetts, 02466 US




Copyright 2021 TechTarget, Inc. All rights reserved. Designated trademarks, brands, logos and service marks are the property of their respective owners.
