CFP: AusDM 2016 paper submission extended to 2 Sept

14th Australasian Data Mining Conference (AusDM 2016)
Canberra, Australia,
6-8 December 2016
The Australasian Data Mining Conference has established itself as the premier Australasian meeting for both practitioners and researchers in data mining. AusDM’16 seeks to showcase: Research Prototypes; Industry Case Studies; Practical Analytics Technology; and Research Student Projects.

Publication and topics
We are calling for papers, both research and applications, from both academia and industry, for presentation at the conference. Accepted papers will be published in an upcoming volume (Data Mining and Analytics 2016) of the Conferences in Research and Practice in Information Technology (CRPIT) series by the Australian Computer Society, which is also held in full text on the ACM Digital Library. AusDM invites contributions addressing current research in data mining and knowledge discovery as well as experiences, novel applications and future challenges.

Submission of papers
– Academic submissions: Regular academic submissions reporting on research progress can be made in the Research Track, with a paper length of between 8 and 12 pages in CRPIT style.
– Industry submissions: Submissions can be made in the Application Track to report on specific data mining implementations and experiences in government and industry projects. Submissions in this category can be between 4 and 8 pages in CRPIT style.
– Industry Showcase submissions: Submissions from industry and government on an analytics solution that has raised profits, reduced costs and/or achieved other important policy and/or business outcomes can be made in this track with a one-page abstract only.

Online submission system

Important Dates
Paper Submission: extended to 6pm, Friday 2 Sept 2016, Australian Eastern Standard Time (AEST)
Authors Notified: Monday 24 October 2016
Camera Ready Submission: Monday 7 November 2016
Conference Dates: 6-8 December 2016

Seminar: Data Mining for Biosecurity Regulation, University of Canberra, Wednesday 10 Aug 2016

Topic: Data Mining for Biosecurity Regulation
Speaker: A/Prof. Andrew Robinson, Melbourne University
When: 4:30pm-5:30pm, Wednesday 10 Aug 2016
Where: 9A1 (Building 9, Room A1), University of Canberra.

The Department of Agriculture and Water Resources (the department) seeks to mitigate the inherent biosecurity risk of various pathways by various control measures. This presentation focuses on the deployment of data-mining tools on a collection of data resources held by the department. The overall results of the data mining exercises were very encouraging; we developed statistically reliable models that produced operationally realistic predictions. We discuss the benefits and challenges of statistical analysis of operational data resources.

About the speaker:
Andrew Robinson is Reader and Associate Professor in applied statistics, and deputy director of the Centre of Excellence for Biosecurity Risk Analysis (CEBRA), at the University of Melbourne. Professor Robinson spends much of his time thinking about biosecurity at national borders, including analyzing inspection and interception data using statistical tools, designing and trialing inspection surveillance systems, developing metrics by which regulatory inspectorates can assess their performance, and discussing all of the above with interested parties. He is a co-author of three books: Introduction to Scientific Programming and Simulation Using R, Forest Analytics with R, and Methods of Statistical Model Estimation.


Canberra Data Miners: Seminar on Text, Knowledge and Information Extraction, by Dr Lizhen Qu (NICTA), Canberra, 4:30-5:30pm, Tuesday 1 Sept

Topic: Text, Knowledge, and Information Extraction
Speaker: Dr. Lizhen Qu, Researcher at NICTA
Organizer: Canberra Data Miners Meetup Group
Date and time: 4:30-5:30pm, Tuesday 1 Sept
Location: Teal Room, Inspire Centre (Building 25), University of Canberra, Pantowora St, Bruce


The volume of unstructured text is growing at an astounding rate. Managing documents, mining interesting information from text, and making decisions based on large volumes of text pose a big challenge in this era. One solution is to apply information extraction (IE) techniques, which map unstructured text into structured knowledge representations and store them in existing databases or knowledge bases. Then we can apply existing data analytics tools based on structured data for diverse purposes. In this talk, I will walk you through the core IE techniques such as named entity recognition, named entity disambiguation, and relation extraction, as well as their real-world applications. I will also cover our ongoing work on harvesting domain-specific knowledge using deep learning techniques.


Dr. Lizhen Qu is currently a researcher at the Machine Learning Research Group of National ICT Australia (NICTA) and a research fellow at the Australian National University. He was an invited speaker at the Machine Learning Summer School in Sydney in 2015. Prior to joining NICTA, Lizhen Qu was a post-doc at the Max Planck Institute for Informatics. Dr. Qu completed his PhD in Sentiment Analysis at the Max Planck Institute for Informatics and the University of Saarland. In 2008, he received the Diploma degree from the Computer Science Department at the Technical University of Kaiserslautern. His main research focus is natural language processing (NLP), with particular emphasis on machine learning approaches. He is especially interested in devising deep learning models to extract structured representations of knowledge from unstructured text. More details about Dr. Qu can be found at

Yanchang Zhao
Organizer of Canberra Data Miners Group

Slides of 10+ excellent tutorials at KDD 2015: Spark, graph mining and many more

by Yanchang Zhao

I attended the KDD 2015 conference in Sydney last week. At the conference, there were more than 10 tutorials and I went to two of them: 1) Graph-Based User Behavior Modeling: From Prediction to Fraud Detection, and 2) Large Scale Distributed Data Science using Apache Spark. Both tutorials were very popular and the rooms were full, with some attendees standing and some sitting on the floor.

The speakers and the conference organizers kindly provided the tutorial slides online. I strongly suggest having a look at the slides if you did not attend the conference. Below is a list of the tutorials at the conference.

– VC-Dimension and Rademacher Averages: From Statistical Learning Theory to Sampling Algorithms
– Graph-Based User Behavior Modeling: From Prediction to Fraud Detection
– A New Look at the System, Algorithm and Theory Foundations of Large-Scale Distributed Machine Learning
– Dense subgraph discovery (DSD)
– Automatic Entity Recognition and Typing from Massive Text Corpora: A Phrase and Network Mining Approach
– Big Data Analytics: Optimization and Randomization
– Social Media Anomaly Detection: Challenges and Solutions
– Diffusion in Social and Information Networks: Problems, Models and Machine Learning Methods
– Medical Mining
– Large Scale Distributed Data Science using Apache Spark
– Data-Driven Product Innovation
– Web Personalization and Recommender Systems

More good news: most (if not all) presentations at KDD 2015 were video recorded, so hopefully the videos will be available on the conference website soon.

A mirror site for Chinese users

RDataMining now has a mirror website. Users in China can download RDataMining documents, code and data from the mirror site if they have no access to the primary site.

Note that the original site will still be the primary site; please visit the mirror only when you have no access to the primary site.

Please feel free to let me know if you have access to neither of the two sites. Thanks.

Contact: Yanchang Zhao <yanchang(at)>

Call for participation: AusDM 2015, Sydney, 8-9 August

The 13th Australasian Data Mining Conference (AusDM 2015)
Sydney, Australia, 8–9 August 2015

The Australasian Data Mining Conference is devoted to the art and science of intelligent data mining: the meaningful analysis of (usually large) data sets to discover relationships and present the data in novel ways that are compact, comprehensible and useful for researchers and practitioners.

This conference will bring together researchers and practitioners from the data mining and business analytics community to share and learn about research and progress in the local context and new breakthroughs in data mining algorithms and their applications.


Discovering Negative Links on Social Networking Sites
Prof Huan Liu, Arizona State University

Large Scale Metric Learning using Locality Sensitive Hashing
Prof Ramamohanarao Kotagiri, University of Melbourne

Big Data for Everyone
Prof Jian Pei, Simon Fraser University

Big Data Mining and Data Science
Prof Yong Shi, Chinese Academy of Sciences

Deep Broad Learning – Big Models for Big Data
Prof Geoff Webb, Monash University

Algorithm acceleration for high throughput biology
Prof Wei Wang, University of California, Los Angeles

Big Data Analytics in Business Environments
Prof Hui Xiong, Rutgers, the State University of New Jersey

On Mining Heterogeneous Information Networks
Prof Philip Yu, University of Illinois at Chicago

Resource Management in Cloud Computing Systems
Prof Albert Zomaya, University of Sydney

Big Data Algorithms and Clinical Applications
A/Prof Yixin Chen, Washington University

Defining Data Science
Prof Yangyong Zhu, Fudan University

Learning with Big Data by Incremental Optimization of Performance Measures
Prof Zhi-Hua Zhou, Nanjing University

Accepted Papers

Research Track:

FSMEC: A Feature Selection Method based on the Minimum Spanning Tree and Evolutionary Computation
Amer Abu Zaher, Regina Berretta, Ahmed Shamsul Arefin and Pablo Moscato

Mining Productive Emerging Patterns and Their Application in Trend Prediction
Vincent Mwintieru Nofong

Detection of Structural Changes in Data Streams
Ross Callister, Mihai Lazarescu and Duc-Son Pham

Multiple Imputation on Partitioned Datasets
Michael Furner and Md Zahidul Islam

Particle Swarm Optimisation for Feature Selection: A Size-Controlled Approach
Bing Xue and Mengjie Zhang

Complement Random Forest
Md Nasim Adnan and Zahid Islam

Aspect-Based Opinion Mining from Product Reviews Using Conditional Random Fields
Amani Samha, Yuefeng Li and Jinglan Zhang

On Ranking Nodes using kNN Graphs, Shortest-paths and GPUs
Ahmed Shamsul Arefin, Regina Berretta and Pablo Moscato

Link Prediction and Topological Feature Importance in Social Networks
Stephan Curiskis, Thomas Osborn and Paul Kennedy

AWST: A Novel Attribute Weight Selection Technique for Data Clustering
Md Anisur Rahman and Md Zahidul Islam

Genetic Programming Using Two Blocks To Extract Edge Features
Wenlong Fu, Mengjie Zhang and Mark Johnston

Designing a knowledge-based schema matching system for schema mapping
Sarawat Anam and Byeong Ho Kang

A Differentially Private Decision Forest
Sam Fletcher and Md Zahidul Islam

Industry Track:

Improving Bridge Deterioration Modelling Using Rainfall Data from the Bureau of Meteorology
Qing Huang, Kok-Leong Ong and Damminda Alahakoon

An Industrial Application of Rotation Forest: Transformer Health Diagnosis
Tamilalagan Natarajan, Duc-Son Pham and Mihai Lazarescu

Non-Invasive Attributes Significance in the Risk Evaluation of Heart Disease Using Decision Tree Analysis
Mai Shouman and Tim Turner

An Improved SMO Algorithm for Credit Risk Evaluation
Jue Wang, Aiguo Lu and Xuemei Jiang

The 2015 Big Data Summit, 9-10 August 2015, collocated with ACM KDD 2015, Sydney

The 2015 Big Data Summit
9-10 August 2015
collocated with ACM KDD 2015, Sydney

We take this opportunity to invite you to
participate in the 2015 Big Data Summit:
• Co-located with ACM KDD2015
• Plenary sessions and keynote speeches by world
industrial and academic leaders
• Big data best practices and highlights in Australia
and New Zealand
• “Big Data in China” Forum
• “Data Science in India” Forum
• “Big Data in Asia” Panel

The theme of this year’s Big Data Summit is “Data to

Since the Summit’s inception in 2012, we have seen
increasing interest and investment within both
industry and government in data-led innovation and
industrialisation to deeply explore the big data universe,
invent data science, train data engineers and scientists,
and develop the data economy.

This year’s event aims to provide analytics professionals
and academia with a global and regional perspective to
outline the big data research, education and development
in the Asia Pacific region, showcase best practices,
explore thought-provoking insights, and demonstrate
solutions and lessons learned across industry,
government and academia.

Who Should Attend?
• Data modellers and business analysts
• Analytics professionals
• Business decision makers
• Policy executives
• Senior Government Representatives
• Academics (including research students)

What Are the Trends and Topics?
• The Data2Economy Agenda: Challenges, Trends and
• Latest Scientific Development in Data and Analytics
• The progress and future of big data in Australia and
New Zealand
• The progress and future of big data in China
• The progress and future of data science in India
• Future of Data Science and Analytics Science
• Data Economy and Industrial Transformation
• Competency, Policies and Processes
• Data Analytics Case Studies and Showcases

Why Should You Attend?
The Summit started in 2012, and the 2013 and 2014 Big Data
Summits (Sydney and Canberra) attracted 250-300
participants from industry, government and academia.
This annual Australian Summit provides a premier and
unique forum for bridging the gaps between academia,
industry and government, and offers independent insights
into the advancement, best practices, trends and controversies
in data science, big data and the data economy.

With prestigious speakers from academia, industry
and government in China, India, Australia, the USA
and Europe, the 2015 Summit will cover a broad spectrum
of big data and analytics aspects and domains. The three
regional Forums organized by India, China and ANZ will
present first-hand views of progress and opportunities
in the Asia Pacific region. The “Big Data in Asia” Panel
will feature world leaders from both the Asia Pacific and
global communities, to draw a big picture of big data
innovation, services, education and economy.

Co-located with ACM KDD2015 at the Hilton Sydney, BDS2015
attracts global interest and offers a unique, high-quality
opportunity for you and your organization to grasp
cutting-edge progress, network
with peers and thought leaders, and, most importantly,
draw more insights and value from your big data and
lift your competency in an increasingly competitive and
challenging market and environment.

For More Information
For more details about the Summit, please visit the Website

Registration for BDS2015 is free of charge. Please check
and register via

For any other inquiries about the Summit, please feel free to
contact us.

