A talk that breaks common misconceptions about data science projects. Learn a much simpler approach to data projects and the substantial gains it can bring.
Everyone says data is the new oil. But do we actually know how to use it efficiently to make our customers' lives better, or is it just another silo of information?
In this talk, we will see a beautiful approach to planning data-based projects inspired by professionals from Google, Twitter, Microsoft, and more. This talk will cover the following: 1. Planning a data project sprint. 2. Establishing purpose and vision. 3. What data matters and what's trash? 4. Mining the sentiments of users. 5. Diminishing the silos. 6. Tools.
From data wrangling to Machine Learning, from Probabilistic Modelling to Optimization, Python provides all the necessary tools to develop a state-of-the-art decision science platform.
During my talk, I will 1) introduce what Decision Science is and why it matters, 2) illustrate a Decision Science modelling problem with an example, and 3) show how Python and open-source Python frameworks can help tackle all the tasks necessary to make an optimal decision.
Apart from answering a question with data, data science projects often require creating a tool that uses machine learning models to do something useful. Machine learning models that do not make it to production cannot provide enough business impact. Some of the challenges in operationalizing a machine learning model are technical, but others are organizational.
This talk begins by introducing some key challenges and how MLOps can help simplify them. The focus is on MLOps concepts and ideas rather than technical specifics. Tools and technologies evolve and change quickly, but the basic principles are likely to remain the same.
Specifically, Tajinder covers: What is deploying to production? What are the steps for deploying a model into production? The talk then delves deeper into how MLOps processes can be built into the key steps of the model life cycle: Build, Preproduction, Deployment, Monitoring, and Governance.
This presentation is ideal for machine learning engineers and data scientists who want to learn more about deploying machine learning models in production. Attendees will walk away with a better understanding of the challenges of operationalizing machine learning models and of how to apply MLOps processes to address some of them.
Pandas is an extremely popular Python package for data analysis, including cleaning and visualizing data before passing it to machine-learning models. In many ways, you can think of Pandas as "Excel inside of Python," with a wealth of capabilities that make it easy to work with data. One of the lesser-known parts of Pandas is its handling of dates and times — a crucial part of what many people use in their work. In this talk, I'll introduce the basic capabilities of Pandas when working with dates and times, including: Reading timestamps from external files, calculating time deltas, querying and comparing time data, using time columns as indexes in data frames, and calculating aggregation functions on "time series" data.
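The capabilities listed above can be sketched in a few lines. This is only an illustrative example, not material from the talk: the CSV content, the column names `timestamp` and `value`, and the cutoff date are all made up for the demonstration.

```python
import io
import pandas as pd

# Hypothetical CSV content standing in for an external file.
csv_text = """timestamp,value
2021-04-08 09:00:00,10
2021-04-08 15:30:00,20
2021-04-09 10:00:00,30
2021-04-10 12:00:00,40
"""

# Reading timestamps: parse_dates converts the column to datetime64 values.
df = pd.read_csv(io.StringIO(csv_text), parse_dates=["timestamp"])

# Time deltas: subtracting datetimes yields Timedelta values.
df["since_first"] = df["timestamp"] - df["timestamp"].min()

# Querying and comparing: boolean masks work with Timestamp values.
after_cutoff = df[df["timestamp"] >= pd.Timestamp("2021-04-09")]

# Time column as index, which enables date-based selection with .loc.
indexed = df.set_index("timestamp")
april_8 = indexed.loc["2021-04-08"]

# Aggregation on time series data: group rows into daily buckets and sum.
daily_total = indexed["value"].resample("D").sum()
```

Setting the time column as the index is what makes both the string-based date selection and `resample` available.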
Audio Events, or Acoustic Events, are individual distinct sounds. Audio Event Detection (AED) is the task of detecting such sounds and returning the precise times at which each kind (class) of sound occurs. This can be anything from detecting coffee beans cracking while roasting, to gunshots on a shooting range, to noise made by construction works. All of these are real applications the presenter has developed. This practical talk will show how you can build such a system in Python, using machine learning models applied to audio. The general approaches shown can also be applied to other sensor data such as vibration or pressure.
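One step common to many detection pipelines is turning a model's per-frame scores into events with start and end times. The sketch below is an assumption about how that step might look, not the presenter's implementation: the function name `frames_to_events`, the threshold, the frame hop, and the probability values are all hypothetical.

```python
def frames_to_events(probs, threshold=0.5, hop_seconds=0.1):
    """Turn per-frame probabilities for one sound class into (start, end) times.

    probs is assumed to hold one model score per audio frame, with frames
    spaced hop_seconds apart. Both are illustrative assumptions.
    """
    events = []
    start = None  # index of the first frame of the event in progress
    for i, p in enumerate(probs):
        active = p >= threshold
        if active and start is None:
            start = i  # an event begins at this frame
        elif not active and start is not None:
            events.append((start * hop_seconds, i * hop_seconds))
            start = None  # the event ended just before this frame
    if start is not None:
        # The recording ended while an event was still active.
        events.append((start * hop_seconds, len(probs) * hop_seconds))
    return events


# Made-up scores for eight frames spaced 0.5 seconds apart.
probs = [0.1, 0.2, 0.9, 0.8, 0.3, 0.1, 0.7, 0.95]
events = frames_to_events(probs, threshold=0.5, hop_seconds=0.5)
print(events)  # [(1.0, 2.0), (3.0, 4.0)]
```

The same thresholding logic would apply to scores computed from other sensor streams, since it only depends on a sequence of per-frame values.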
"Using Machine Learning to Predict Drug-Drug Interactions"
Drug-drug interactions (DDIs) are an often overlooked aspect of medicine that can have drastic implications. When drugs are prescribed and taken together, adverse drug reactions (ADRs) may result, with significant impacts on a patient's health. However, limitations in clinical trials mean that ADRs may only be detected after a drug has been approved for clinical use. Hence, to assist in the prediction of DDIs, machine learning algorithms can be used to identify drugs with a high potential to interact. Our project uses data from the DrugBank database, including Anatomical Therapeutic Chemical (ATC) classification codes and Simplified Molecular-Input Line-Entry System (SMILES) codes, as well as the recorded drug interactions. We obtained 2,770 drugs with both ATC and SMILES codes as valid drugs for analysis. By extracting the interactions of each type into an individual CSV file, we were able to analyse the properties of each drug and run KNN, Decision Tree regression and classification, Random Forest regression and classification, and naive Bayes prediction models. The classifiers compared the chemical, therapeutic and interaction similarities of the drugs to predict whether the drugs in the test set would have an adverse reaction. We then evaluated the models with various metrics, finding that Decision Tree produced the best classification and regression model for the prediction of DDIs. The limitations of our project included a lack of fully comprehensive data, which resulted in a fairly small sample size. With proper access to information, such a method could be expanded to provide accurate and reliable results.
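To make the idea of similarity-based prediction concrete, here is a minimal nearest-neighbour sketch. It is not the project's code or data: the feature tokens (such as `atc:C10` or `frag:ester`), the labels, and the helper names `jaccard` and `knn_predict` are all hypothetical, and Jaccard similarity is just one simple choice of similarity measure.

```python
def jaccard(a, b):
    """Jaccard similarity between two feature collections (0.0 to 1.0)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def knn_predict(query, labelled, k=3):
    """Majority vote among the k labelled examples most similar to query."""
    ranked = sorted(labelled, key=lambda item: jaccard(query, item[0]), reverse=True)
    votes = [label for _, label in ranked[:k]]
    return sum(votes) > len(votes) / 2


# Made-up training examples: (feature set, 1 = interaction, 0 = none).
labelled = [
    ({"atc:C10", "frag:ester", "frag:lactone"}, 1),
    ({"atc:C10", "frag:lactone", "frag:fluoro"}, 1),
    ({"atc:N02", "frag:amide"}, 0),
    ({"atc:N02", "frag:phenol", "frag:amide"}, 0),
    ({"atc:C10", "frag:ester"}, 1),
]

query = {"atc:C10", "frag:lactone"}
predicted = knn_predict(query, labelled, k=3)
print(predicted)  # True
```

In practice a library implementation and real chemical or therapeutic descriptors would replace these toy sets; the sketch only shows how a similarity measure drives a KNN vote.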
The use of AI is now pervasive across industries. Adoption has picked up pace mostly in the last 5 years and shows no signs of slowing down. This is taking us to a world where AI will be involved in decisions about almost everything in our lives: whether an applicant is eligible for a mortgage, whether a patient's scan shows cancer, which route you should take for your commute, and which packet of peanut butter you buy based on the search results.
In this talk we make the case that an independent regulator is needed to create the standards and guidelines for the adoption of the technology across industries. Relying on regulators for specific industries will lead to inconsistent standards, and may leave most industries without properly defined standards at best, or with no regulatory oversight of how the technology is being used at worst.
While offline events are temporarily gone, Geekle never stops! We are running the Data Science Global Summit on April 8, 2021. Our speakers are leading experts from top companies all over the world who are ready to share the challenges and prospects ahead for the Python community.
Geekle has unique experience in organizing huge tech summits with 10,000+ attendees in different tech domains. We hope to create something the world has never seen before for the Data Science Community!
See you all!
Geekle Corp. 910 Foulk Road, Suite 201 Wilmington, DE 19803, USA
Junior Track ticket (free live stream access to the Python for ML and AI - April 8-9)