UIUC free online courses on data mining starting on 9 Feb, lectured by Prof. Jiawei Han et al.

by Yanchang Zhao, RDataMining.com

A series of free online data mining courses will start on 9 Feb 2015, taught by Prof. Jiawei Han and several other staff at UIUC. Prof. Han is one of the top data mining researchers in the world, and has authored “Data Mining: Concepts and Techniques”, one of the most popular data mining textbooks. Do not miss this opportunity if you are interested in learning data mining techniques.

Course 1. Pattern Discovery in Data Mining, by Prof. Jiawei Han
https://www.coursera.org/course/patterndiscovery
Start: 9 Feb 2015
End: 8 Mar 2015

Course 2. Text Retrieval and Search Engines, by Chengxiang Zhai
https://www.coursera.org/course/textretrieval
Start: 16 Mar 2015
End: 12 Apr 2015

Course 3. Cluster Analysis in Data Mining, by Prof. Jiawei Han
https://www.coursera.org/course/clusteranalysis
Start: 27 Apr 2015
End: 24 May 2015

Course 4. Text Mining and Analytics, by Chengxiang Zhai
https://www.coursera.org/course/textanalytics
Start: 8 Jun 2015
End: 5 Jul 2015

Course 5. Data Visualization, by John C. Hart
https://www.coursera.org/course/datavisualization
Start: 20 Jul 2015
End: 16 Aug 2015

You can join the above five courses for free. However, if you want a verified certificate, you can pay $55 for each individual course, or take the whole Data Mining Specialization, which includes the above five courses and a Capstone project. See details at https://www.coursera.org/specialization/datamining/20.


Free online data mining and machine learning courses by Stanford University

by Yanchang Zhao, RDataMining.com

Three free online data mining and machine learning courses taught by professors at Stanford University started in the past two weeks, offering excellent opportunities to learn advanced data mining and machine learning techniques. If you are interested, join soon while they are still open.

1. Machine Learning
Start: Jan 19, 2015
End: Apr 20, 2015
Instructor: Andrew Ng, Stanford University
URL: https://www.coursera.org/course/ml

2. Mining Massive Datasets
Start: Jan 31, 2015
End: Mar 25, 2015
Instructors: Jure Leskovec, Anand Rajaraman and Jeff Ullman, Stanford University
URL: https://www.coursera.org/course/mmds

3. Statistical Learning (with R)
Start: Jan 20, 2015
End: Apr 5, 2015
Instructors: Prof. Trevor Hastie, Prof. Rob Tibshirani, Stanford University
URL: https://class.stanford.edu/courses/HumanitiesandScience/StatLearning/Winter2015/about

Visit RDataMining.com for more news on free online courses and webinars on data mining and analytics.


Canberra IAPA Seminar – Text Analytics: Natural Language into Big Data – 17 February

Topic: Text Analytics: Natural Language into Big Data
Speaker: Dr. Leif Hanlen, Technology Director at NICTA
Date: Tuesday 17 February
Time: 5.30pm for a 6pm start
Cost: Nil
Where: SAS Offices, 12 Moore Street, Canberra, ACT 2600
Registration URL: http://www.iapa.org.au/Event/TextAnalyticsNaturalLanguageIntoBigData

Abstract:
We outline several activities in NICTA relating to understanding and mining free text. Our approach is to develop agile service-focussed solutions that provide insight into large text corpora, and allow end users to incorporate current text documents into standard numerical analysis technologies.

Biography:
Dr. Leif Hanlen is Technology Director at NICTA, Australia’s largest ICT research centre. Leif is also an adjunct Associate Professor of ICT at the Australian National University and an adjunct Professor of Health at the University of Canberra. He received a BEng (Hons I) in electrical engineering, BSc (Comp Sci) and PhD (telecomm) from the University of Newcastle, Australia. His research focusses on applications of machine learning to text processing.

Please feel free to forward this invite to your friends and colleagues who might be interested. Thanks.


Recordings of RStudio Webinar Series on Essential Tools for Data Science with R

by Yanchang Zhao, RDataMining.com

RStudio recently ran a series of live webinars on Essential Tools for Data Science with R, but the schedule was inconvenient for people in other time zones. Fortunately, the recordings are now available online, so you can watch them if you missed the live webinars. Below is a list of the recordings.

1. The Grammar and Graphics of Data Science
– dplyr: a grammar of data manipulation – Hadley Wickham
– ggvis: Interactive graphics in R – Winston Chang
– URL: http://pages.rstudio.net/Webinar-Series-Recording-Essential-Tools-for-R.html

2. Reproducible Reporting
– The Next Generation of R Markdown – Jeff Allen
– Knitr Ninja – Yihui Xie
– Packrat – A Dependency Management System for R – J.J. Allaire & Kevin Ushey
– URL: http://pages.rstudio.net/Webinar-Reproducible-Reporting.html

3. Interactive Reporting
– Embedding Shiny Apps in R Markdown documents – Garrett Grolemund
– Shiny: R made interactive – Joe Cheng
– URL: http://pages.rstudio.net/Interactive-Reporting.html


R and Data Mining – Examples and Case Studies now in Chinese

My book, R and Data Mining – Examples and Case Studies, now has a Chinese edition, translated by researchers at South China University of Technology and published by China Machine Press in September 2014. It is sold in China only, at a price of RMB 49. If you are in China, this is an opportunity to get a copy of the book at a bargain price.

Details of the book are available at http://www.rdatamining.com/books/rdm, and its original English version can be bought from Amazon at http://www.amazon.com/Data-Mining-Examples-Case-Studies/dp/0123969638.

Its first 11 chapters can be downloaded for free at http://www.rdatamining.com/docs, and R code and data for the book are available at http://www.rdatamining.com/books/rdm/code.

RDataMining book in Chinese


R and Data Mining Workshop at AusDM 2014, Brisbane, 27 November

R and Data Mining Workshop at AusDM 2014
http://ausdm14.ausdm.org/workshop

There will be a half-day workshop on R and Data Mining at the AusDM 2014 conference in Brisbane on the afternoon of Thursday 27 November. The workshop will consist of several sessions on data mining with R, including:

  • Introduction to Data Mining with R
  • Association Rule Mining with R
  • Text Mining with R — an Analysis of Twitter Data
  • Regression and Classification with R
  • Data Clustering with R

Examples of R code will be presented at all sessions. At the end of every session, attendees will have 10 to 15 minutes to practice with the provided R code on computers.

If you are interested in attending the workshop or AusDM 2014, you can still register for the conference by Wednesday 26 November at http://ausdm14.ausdm.org/registration. The workshop is included in conference registration.

If you cannot attend the conference, you can find the workshop details and download its slides at http://ausdm14.ausdm.org/workshop.


Slides of keynote speeches, tutorials and panelist presentations at IEEE Big Data 2014

Slides of keynote speeches, tutorials and panelist presentations at the 2014 IEEE International Conference on Big Data can be found on the conference website at the links below.

(1) Keynote speech
http://cci.drexel.edu/bigdata/bigdata2014/keynotespeech.htm
– Never-Ending Language Learning, Tom Mitchell – E. Fredkin University Professor, Machine Learning Department, Carnegie Mellon University
– Smart Data – How you and I will exploit Big Data for personalized digital health and many other activities, Amit Sheth, LexisNexis Ohio Eminent Scholar, Kno.e.sis – Wright State University
– Addressing Human Bottlenecks in Big Data, Joseph M. Hellerstein, Chancellor’s Professor of Computer Science, University of California, Berkeley and Trifacta

(2) Tutorials
http://cci.drexel.edu/bigdata/bigdata2014/tutorial.htm
– Big Data Stream Mining
Presenters: Gianmarco De Francisci Morales, Joao Gama, Albert Bifet, and Wei Fan
– Big ML Software for Modern ML Algorithms
Presenters: Eric P. Xing and Qirong Ho
– Large-scale Heterogeneous Learning in Big Data Analytics
Presenter: Jun Huan
– Big Data Benchmarking
Presenters: Chaitan Baru and Tilmann Rabl

(3) Panel: Big Data Challenges and Opportunities
http://cci.drexel.edu/bigdata/bigdata2014/panel.htm
