I first came to know about 23andMe when they visited a career fair at USC in February, before COVID had begun. It was a start-up fair, and I remember that many people there were eager to learn more about the company, and rightly so. I was curious how a biotech company could use AI in its work, so I set out to learn more.

Their mission is to help people access, understand, and benefit from the human genome. They use technology to decipher DNA and extract meaningful information from it, which can be used for a variety of purposes. Theirs is the only direct-to-consumer DNA test that includes 55+ health reports that meet FDA requirements.

They offer a saliva collection kit; the sample is analyzed and the results come back as reports. Once you register on their website, the company sends a collection kit to your home with instructions on how to collect the saliva, and you use the same box to send it back to their laboratory. The kit consists of instructions, a tube to collect saliva, a funnel, and a cap to seal the tube after the saliva is collected. The tube is designed to preserve the DNA for up to 6 months in both hot and cold conditions, so the sample survives transport and arrives intact at the laboratory. They have sold more than 12 million kits around the world so far.

During the ongoing pandemic, 23andMe has also reported some findings from its own research. Their preliminary data suggests that O blood type appears to be protective against the virus compared with all other blood types. According to the data, individuals with O blood type are 9-18% less likely than individuals with other blood types to have tested positive for COVID-19. Among respondents to the 23andMe COVID-19 survey, the percentage reporting a positive test for COVID-19 is also lowest for people with O blood type.

There are many teams at 23andMe that share the workload. The frontend team manages the website and user interface, the backend team manages the data, and the ML team uses AI to extract and predict valuable information. AI is used throughout 23andMe, from sending out the kits, to converting the samples into data for ML platforms, to delivering the results.

ML is used to predict single nucleotide polymorphisms (SNPs) and other variations in the genome. AI is used to measure as many as 1.5 billion locations in the human genome and classify these variations. For some diseases, specific variants in a single gene are the direct cause. For example, cystic fibrosis is caused by variants in the CFTR gene.

In other cases, a low-frequency variant leads to a higher probability of getting the disease. For example, variants in the BRCA1 gene raise the likelihood of breast cancer, and AI can be used to estimate that probability.

For many heart diseases, a single variant does not increase the risk by a significant amount, but a combination of variants can. The ML models take such a combination and produce a risk score from the input data and features. Other features come from surveys filled in by customers; for example, a survey could ask for the user's LDL cholesterol values, or whether the user has ever taken certain types of medication.
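To make the idea of combining many small effects concrete, here is a minimal sketch of a risk score built as a weighted sum passed through a logistic function. This is only an illustration under my own assumptions, not 23andMe's actual model: the variant names, the weights, the baseline, and the way LDL cholesterol enters the score are all hypothetical.

```python
import math

# Hypothetical per-variant weights (log odds ratios); not real 23andMe values.
VARIANT_WEIGHTS = {"variant_a": 0.10, "variant_b": 0.05, "variant_c": 0.20}
# Hypothetical weight for a survey feature: LDL cholesterol, per mg/dL above 100.
LDL_WEIGHT = 0.01
# Assumed baseline risk on the log-odds scale.
BASELINE_LOG_ODDS = -2.0


def risk_score(allele_counts, ldl_mg_dl):
    """Combine variant counts (0, 1 or 2 copies) and a survey value into a probability."""
    log_odds = BASELINE_LOG_ODDS
    for variant, weight in VARIANT_WEIGHTS.items():
        # Each copy of a variant nudges the score by a small amount.
        log_odds += weight * allele_counts.get(variant, 0)
    # Only LDL above the assumed threshold of 100 mg/dL adds to the score.
    log_odds += LDL_WEIGHT * max(ldl_mg_dl - 100.0, 0.0)
    # The logistic function maps log odds to a probability between 0 and 1.
    return 1.0 / (1.0 + math.exp(-log_odds))


low = risk_score({"variant_a": 0, "variant_b": 0, "variant_c": 0}, ldl_mg_dl=90)
high = risk_score({"variant_a": 2, "variant_b": 2, "variant_c": 2}, ldl_mg_dl=160)
print(f"low-risk profile: {low:.3f}, high-risk profile: {high:.3f}")
```

No single weight here is large, yet the profile carrying every variant plus elevated LDL ends up with a noticeably higher score than the profile with none of them.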

Based on the data and features, the ML scientists define training, validation, and test datasets and then perform feature selection. They also take ethnicity information into account and select the model that best fits a particular customer.
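The dataset split described above could look something like the sketch below. The 70/15/15 proportions, the function name `split_dataset`, and the integer stand-ins for customer records are my own assumptions for illustration; I do not know which proportions or tooling the team actually uses.

```python
import random


def split_dataset(records, train_frac=0.7, val_frac=0.15, seed=0):
    """Shuffle records and cut them into train, validation and test lists."""
    shuffled = list(records)
    # A fixed seed keeps the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)
    n_train = round(len(shuffled) * train_frac)
    n_val = round(len(shuffled) * val_frac)
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    # Whatever is left after train and validation becomes the test set.
    test = shuffled[n_train + n_val:]
    return train, val, test


records = list(range(100))  # hypothetical stand-ins for customer records
train, val, test = split_dataset(records)
print(len(train), len(val), len(test))
```

The model is fitted on the training set, candidate models and features are compared on the validation set, and the test set is held back for a final check on data the model has never seen.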

Handling ethnicity information properly and complying with all research regulations are among the challenges the company faces. Because they work with huge datasets, they also need to be careful to fit all the data in memory and avoid doubling memory usage during processing. Class imbalance is another challenge that comes up frequently for these data scientists.
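One common way to deal with class imbalance is to weight each class inversely to how often it appears, so that the rare class (for example, people who have a disease) counts for more during training. The sketch below shows that heuristic on made-up labels; it is a general technique, and I am not claiming it is the one 23andMe uses.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Return a weight per class that is inversely proportional to its frequency."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    # Rare classes get large weights, common classes get small ones.
    return {label: n_samples / (n_classes * count) for label, count in counts.items()}


# Hypothetical labels: 95 people without the condition (0), 5 with it (1).
labels = [0] * 95 + [1] * 5
weights = balanced_class_weights(labels)
print(weights)
```

With these weights, a mistake on one of the 5 positive examples costs the model far more than a mistake on one of the 95 negative examples, which discourages it from simply predicting the majority class every time.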

There are various stories of the 23andMe kit being useful. It has helped an adoptee find her birth family, given people detailed information about themselves, and helped guide lifestyle changes. The reports can indicate a tendency to develop a certain kind of disease in the future, so the user can take appropriate care and act to prevent it.

23andMe provides multiple reports that help with this, such as Ancestry reports, Family Tree, Trait reports, Carrier Status reports, Wellness reports, and Family Health History Tree. It would be interesting to get a report myself, and I would love to learn more about the company and interview people working at 23andMe.

About the Author

Urmil Shah is a master's student at the University of Southern California studying Computer Science. He finds AI and its applications to healthcare very interesting and is looking for a career where his skills can make a positive impact on society.