Chris Kuo/Dr. Dataman
4.2K Followers

Published in Dataman in AI

·Pinned

Handbook of Anomaly Detection: With Python Outlier Detection — (1) Introduction

Anomaly detection is the detection of rare events that deviate significantly from the majority of the data. Those rare events do not conform to a well-defined behavior. They are also called outliers, noise, novelties, or exceptions. Rare events can detrimentally impact the business operation and result in a significant…

Data Science

16 min read


Published in Dataman in AI

·Pinned

Explain Your Model with the SHAP Values

Better Interpretability Leads to Better Adoption. Is your highly trained model easy to understand? A sophisticated machine learning algorithm can usually produce accurate predictions, but its notorious “black box” nature does not help adoption at all. Think about this: If you ask me to swallow a black pill without telling me…

Machine Learning

13 min read


Published in Dataman in AI

·Pinned

Transfer Learning for Image Classification — (2) Pre-trained Image Models

Image classification is the task of recognizing an image. It is also called image recognition. Computer scientists have been innovative in extracting meaning from images. Its history is fascinating, though most people don’t know much about it. For this reason, I am going to tell you the stories of innovation…

Data Science

13 min read


Published in Dataman in AI

·Pinned

The SHAP Values with H2O Models

Many machine learning algorithms are complicated and not easy to understand, even though they deliver an impressive level of accuracy. As humans, we must be able to fully understand how decisions are being made so that we can trust the decisions of AI systems. We need ML models to…

Data Science

9 min read


Published in Dataman in AI

·Pinned

Top Data Science Interview Questions and Answers

You receive a data science interview opportunity from your dream company. You have surveyed many “top 50 questions” articles but still feel uncertain. Since there are already many similar articles, why do I dare to add one more to this crowded topic? In this article, I rewrite many ordinary answers…

Data Science

18 min read


Published in Dataman in AI

·Jan 20

The Intuitions for the Discrete Distributions: Bernoulli, Binomial, Beta, Dirichlet Distributions

Machine learning uses a lot of discrete distributions such as the Bernoulli, Binomial, and Multinomial distributions to solve problems. Two related distributions, the Beta and Dirichlet distributions, are continuous rather than discrete; they are less well known but are widely used in data science. The Dirichlet distribution is especially important in Natural Language Processing (NLP). The…

Data Science

16 min read


Published in Dataman in AI

·Oct 9, 2022

Handbook of Anomaly Detection: With Python Outlier Detection — (11) XGBOD

In Chapter 1, we discussed how supervised learning can better target known outliers, while unsupervised learning can explore new types of outliers. Can we take advantage of both supervised and unsupervised learning? …

Data Science

10 min read


Published in Dataman in AI

·Oct 9, 2022

Handbook of Anomaly Detection: With Python Outlier Detection — (6) OCSVM

Classification problems are often solved using supervised learning algorithms such as Random Forest, Support Vector Machine, Logistic Regression, and so on. Supervised learning algorithms require a known target to build a model. However, it is often the case that we only see normal data patterns but not rare events. The…

Data Science

10 min read


Published in Dataman in AI

·Oct 9, 2022

Handbook of Anomaly Detection: With Python Outlier Detection — (10) Cluster-Based Local Outlier Factor (CBLOF)

After introducing the Local Outlier Factor (LOF), we can introduce the Cluster-Based Local Outlier Factor (CBLOF). It defines anomalies by combining two things: the local distances to nearby clusters and the size of the cluster to which the data point belongs. It first clusters data points into large or small clusters…

Data Science

10 min read


Published in Dataman in AI

·Oct 8, 2022

Handbook of Anomaly Detection: With Python Outlier Detection — (9) Local Outlier Factor (LOF)

The Local Outlier Factor (LOF) is another effective unsupervised learning method for outlier detection. Since its invention in the early 2000s (Breunig et al., 2000 [1]), it has been applied widely to different types of problems. It is a density-based technique that uses the nearest-neighbor search to identify anomalous points…

Data Science

12 min read


Chris Kuo/Dr. Dataman

The Dataman articles are my reflections on data science and teaching notes at Columbia University: https://sps.columbia.edu/faculty/chris-kuo
