Supercomputer Center Begins Data Mining Boot Camps

Business leaders and academic professionals will have the opportunity to train in evaluating data and models.

UCSD’s San Diego Supercomputer Center is developing the 2014 series of “Data Mining Boot Camps” to help business professionals and academic research scientists learn how to interpret data and how to design, build, verify and evaluate models.

SDSC’s Predictive Analytics Center of Excellence (PACE) developed the two-day sessions to give professionals outside computer science the training and tools to detect patterns and relationships in data. Initially launched in October 2012, the program helps organizations build analytical skills, training employees as data scientists and preparing managers and analysts to examine large and varied data sets in depth.

“Conventional statistical analysis and business intelligence software, although useful, are not designed to capture, curate, manage and process large quantities of data generated by most enterprises,” PACE director Natasha Balac told UCSD News. The Data Mining Boot Camps thus give professionals a way to learn better tools for making sense of this deluge of information.

Our society creates 2.5 quintillion bytes of data daily, and researchers must make sense of all this information, which has driven a rise in demand for data scientists. The boot camp sessions have attracted participants from a wide variety of industries and business sectors, such as food services and gaming.

“Data mining and predictive modeling, now commonly referred to as data science, are capable of automatic extraction of meaningful value hidden in this data, enabling discovery of new insights and providing a competitive edge,” Balac said. Managers and analysts will be equipped with the necessary tools to process the vast volumes of data.

The PACE boot camps grew out of the data mining certificate course offered through UCSD Extension. They cover basic data mining, data analysis, pattern recognition concepts and predictive modeling algorithms, and participants can tailor the analyses to their own data.

Boot camp participants will also be able to use SDSC’s Gordon, a supercomputer with 300 terabytes of flash memory. They will apply data mining algorithms to actual data for hands-on training, and the classroom setting lets instructors work with students one-on-one.

“Gathering data is easy. In fact, it’s so easy it’s exceeding our capacity to validate, analyze, visualize, store and curate,” SDSC director Mike Norman said on the PACE website. “And many of our critical scientific problems can be solved by harnessing this data.” With the analysis methods taught at the PACE boot camps, professionals outside computer science can process and understand the massive volumes of data generated daily.