Data Synthetics without A.I. and Why This Adds Value to You as an Individual
Data Synthetics is a term I coined to refer to the framework and processes for synthesizing data (instead of just analyzing it). It's by far the most significant thing in data science today and one of the many applications of A.I.: specialized systems generate new data based on a given dataset, all while maintaining the properties of the original dataset. But isn't there an abundance of data out there? Well, yes, but we could always use some more. The rationale is much like that of a fiction writer, who often prefers to create her own characters for a novel or a short story even though there are plenty of real-world people she could copy into her text. So, if you don't want to be part of someone else's work of fiction (especially one that gets published and read by many other people), you may want to keep your personally identifiable information (PII) from roaming free in the world. Some of that information you may be unable to change (e.g., health-related PII, aka PHI), so protecting it is of paramount importance.
Data synthetics can do this for you by generating new data that is very similar to existing data, thereby creating an unbridgeable gap between your PII and the data used by, for example, a predictive model. This similarity also helps keep the model's predictions relevant to you, since the general underlying pattern (aka the signal in the data) remains the same.
Plenty of brilliant A.I. professionals, be they scientists or engineers, have delved into this problem and come up with mathematically elegant solutions. One such solution is the Variational Autoencoder (VAE, link to a comprehensive and somewhat comprehensible article on this topic), a kind of artificial neural network (ANN) that aims to figure out the underlying distributions of the data and create new data based on them. These distributions are a mathematical model that aims to describe the signal. It is not the only such model, and probably not even the best one, but it's good enough for something basic. The problem with VAEs (and other A.I. systems) is that they need sufficiently large datasets to figure out this signal and manifest it in new data. Additionally, building a VAE isn't simple unless you understand the technology and the not-so-trivial math involved.
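To make the "estimate the underlying distribution, then create new data from it" idea concrete, here is a minimal Python sketch that does it without any neural network. It is neither a VAE nor the ROOF framework discussed in this post. It simply assumes the data is roughly Gaussian, estimates the mean and covariance, and samples new rows from that fit. The function names (`fit_gaussian`, `cholesky`, `synthesize`) and the toy height/weight-style dataset are made up for illustration.

```python
import math
import random

def fit_gaussian(rows):
    """Estimate the mean vector and sample covariance matrix of the rows."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    cov = [[sum((r[i] - means[i]) * (r[j] - means[j]) for r in rows) / (n - 1)
            for j in range(d)] for i in range(d)]
    return means, cov

def cholesky(a):
    """Return lower-triangular L with L * L^T == a (a must be positive definite)."""
    d = len(a)
    L = [[0.0] * d for _ in range(d)]
    for i in range(d):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(a[i][i] - s)
            else:
                L[i][j] = (a[i][j] - s) / L[j][j]
    return L

def synthesize(rows, n_new, seed=0):
    """Draw n_new rows from a Gaussian fitted to the original rows."""
    rng = random.Random(seed)
    means, cov = fit_gaussian(rows)
    L = cholesky(cov)
    d = len(means)
    out = []
    for _ in range(n_new):
        # Independent standard normals, then correlate them via L.
        z = [rng.gauss(0.0, 1.0) for _ in range(d)]
        out.append([means[i] + sum(L[i][k] * z[k] for k in range(i + 1))
                    for i in range(d)])
    return out

# Toy "original" dataset (invented for this example): two correlated columns.
_rng = random.Random(42)
original = []
for _ in range(200):
    x = _rng.gauss(170.0, 10.0)
    original.append([x, 0.5 * x + _rng.gauss(0.0, 3.0)])

synthetic = synthesize(original, n_new=2000)

orig_means, orig_cov = fit_gaussian(original)
synth_means, synth_cov = fit_gaussian(synthetic)
print("original means: ", [round(m, 2) for m in orig_means])
print("synthetic means:", [round(m, 2) for m in synth_means])
print("original covariance (x, y): ", round(orig_cov[0][1], 2))
print("synthetic covariance (x, y):", round(synth_cov[0][1], 2))
```

None of the synthetic rows is copied from the original data, yet the means and the relationship between the two columns carry over. A single Gaussian is about the crudest distribution model there is, so treat this as an illustration of the principle rather than a practical tool.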
What if there were a way to develop synthetic data without utilizing A.I.? What if all you needed to know was the math you learned in school and a few other things based on that math, elegant but not overly sophisticated? Well, that's what I've done recently, with sufficient success to consider this something usable and useful. This framework (which I call ROOF, hence the picture on the top), which I developed in Julia 1.5, is light on computational resources and can be applied to any kind of continuous data (there is also a version for ordinal data, though I imagine that's not something you care about that much). If you are in this line of work or know someone who is, feel free to reach out to me. Cheers!