Partial Dependence Plots for Models
Image of Partial Dependence Plots for Models

As a complement to our popular decision trees visualization and the sunburst, we are launching a third view for your models: the Partial Dependence Plot. This heatmap chart also allows you to analyze the marginal impact of each input field on predictions for classification and regression models built by using ensembles and logistic regressions.

supervised classification regression visualization dashboard predictions model miscellaneous
Batch Deletion for Resources
Image of Batch Deletion for Resources

Solving a Machine Learning problem is an iterative process that requires creating a great number of intermediate datasets, models, evaluations, and predictions before you reach the final model. BigML now simplifies this process and keeps your account organized and up to date by letting you delete multiple resources at the same time. Just click the deletion icon in the resource listing in the Dashboard and select the resources to be deleted.

deletion resources api dashboard miscellaneous
E-mail Notifications for Scripts
Image of E-mail Notifications for Scripts

Asynchronous WhizzML script executions can take some time to finish because of the complex Machine Learning workflows they implement. You no longer need to check your execution repeatedly to see if your results are ready. This new option lets you ask to be notified by e-mail once the execution finishes, so you can concentrate on other tasks in the meantime.

whizzml scripts executions notifications
Scriptify: Reify Complex Workflows
Image of Scriptify: Reify Complex Workflows

Furthering our obsession with speeding up your Machine Learning processes, we have incorporated Scriptify into your 1-click menu options. Now you can automatically regenerate any BigML resource (models, evaluations, predictions, etc.) with a single click. Scriptify creates a script that contains all the workflow information end to end (from configuration parameters to resources created). You can precisely repeat the processing steps of any original Machine Learning resource to your heart's content!

whizzml workflows automation resources 1-click actions
Shared Resources Cloning
Image of Shared Resources Cloning

Now you can easily clone datasets, models, and scripts from other users into your BigML account. Provided that a user shares a resource using the sharing link and the cloning capability is enabled, any other user with access to the link will be able to include this resource in their BigML account.

This new feature allows you to make full use of shared resources. For example, when another user shares a dataset using the sharing link, it is in "view only" mode, so you cannot perform actions such as creating new models, exporting it, or sampling it. By cloning it, you will be able to perform all BigML actions available for datasets.

resources cloning sharing link dashboard web
Stats Computation for Logistic Regression
Image of Stats Computation for Logistic Regression

These new Dashboard statistics allow you to inspect the predictive power of your model by revealing the significance of each coefficient estimate. BigML computes the likelihood ratio to test how well the model fits your data, along with the p-value, confidence interval, standard error, and Z score for each coefficient.

Learn more about the Logistic Regression statistics in the Dashboard documentation.
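
To make these statistics concrete, here is a minimal sketch of how they relate to one another, assuming the usual Wald-style formulas based on a normal approximation. The announcement does not say how BigML computes them, and the function names and sample numbers below are hypothetical.

```python
import math


def coefficient_stats(estimate, std_error, z_crit=1.959964):
    """Wald-style statistics for one coefficient.

    Hypothetical helper for illustration; not part of the BigML API.
    z_crit is the standard normal quantile for a 95% two-sided interval.
    """
    z_score = estimate / std_error
    # Two-sided p-value under a standard normal:
    # 2 * (1 - Phi(|z|)) equals erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z_score) / math.sqrt(2))
    margin = z_crit * std_error
    return {
        "z_score": z_score,
        "p_value": p_value,
        "confidence_interval": (estimate - margin, estimate + margin),
    }


def likelihood_ratio(loglik_null, loglik_model):
    """Likelihood-ratio statistic comparing a fitted model with a null model."""
    return 2.0 * (loglik_model - loglik_null)


# Made-up coefficient estimate and standard error.
stats = coefficient_stats(estimate=0.8, std_error=0.25)
lr = likelihood_ratio(loglik_null=-120.0, loglik_model=-100.0)
```

A small p-value, or a confidence interval that excludes zero, suggests the coefficient differs from zero more than chance alone would explain.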

logistic regression dashboard supervised stats p-value z score standard error confidence intervals likelihood ratio logisticregression
Association Predictions: Association Sets
Image of Association Predictions: Association Sets

BigML is bringing predictions for Associations to the Dashboard. Association Sets allow you to pinpoint the items which are most strongly associated with your input data. For example, given a set of products purchased by a person, what other products are most likely to be bought?

All the predicted items will be ranked according to a similarity score, and they will be displayed in a table view. You can also visualize each predicted rule in a Venn diagram to get a sense of the correlation strength between the input data and the predicted items. Read more about Association Sets in the 8th chapter of the Associations documentation.

associations predictions itemsets association rules associationset
BigML Certifications
Image of BigML Certifications

We are happy to announce BigML Certifications, for organizations and professionals that want to master BigML to successfully deliver real-life Machine Learning projects. These courses are ideal for software developers, system integrators, analysts, or scientists, to boost their skill set and deliver sophisticated data-driven solutions. We offer two separate courses, each of them consisting of 4 weekly online classes of 3 hours each:

  • Certified Engineer: all you need to know about advanced modeling, advanced data transformations, and how to use the BigML API (and its wrappers) in combination with WhizzML to build and automate your Machine Learning workflows.

  • Certified Architect: learn how to implement your Machine Learning solutions so they are scalable, impactful, capable of being integrated with third-party systems, and easy to maintain and retrain.

If you successfully pass the certification exam, BigML will award you with a diploma. In addition, BigML Certified Partners will receive business referrals that help them source new Machine Learning projects.

courses modeling api supervised unsupervised whizzml data transformations engineer architect miscellaneous
Partial Dependence Plot for Ensembles
Image of Partial Dependence Plot for Ensembles

This new visualization for ensembles, commonly known as a Partial Dependence Plot, allows you to visualize the impact that a set of fields has on predictions. You will be able to determine which fields are most relevant for ensemble predictions and how sensitive your ensemble predictions are to their different values.

The chart displays a heatmap representation of your predictions based on different values of the two selected fields in the axes regardless of the rest of the fields used to train your ensemble. You can select any categorical or numeric field for the axes and configure the values for the rest of the input fields by using the fields inspector panel on the right.
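
The heatmap described above can be approximated with a short sketch. It assumes a generic `predict` callable that takes a dictionary of field values. The function, the toy model, and the field names are all hypothetical, and this is not how the Dashboard is implemented.

```python
def partial_dependence_grid(predict, x_field, x_values, y_field, y_values, fixed_inputs):
    """Build a grid of predictions while varying two fields.

    Hypothetical helper: every other input field keeps the value given in
    fixed_inputs, mirroring the role of the fields inspector panel.
    """
    grid = []
    for y in y_values:
        row = []
        for x in x_values:
            inputs = dict(fixed_inputs)  # copy so the caller's dict is untouched
            inputs[x_field] = x
            inputs[y_field] = y
            row.append(predict(inputs))
        grid.append(row)
    return grid


def toy_predict(inputs):
    """Stand-in for a trained ensemble (made-up linear rule)."""
    return 2.0 * inputs["rooms"] + 0.5 * inputs["age"] + inputs["garage"]


grid = partial_dependence_grid(
    toy_predict,
    x_field="rooms", x_values=[1, 2, 3],
    y_field="age", y_values=[0, 10],
    fixed_inputs={"garage": 1},
)
```

Each row of `grid` corresponds to one value on the vertical axis and each column to one value on the horizontal axis; coloring the cells by value gives the heatmap.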

supervised classification regression visualization dashboard predictions ensemble
Batch Field Importances
Image of Batch Field Importances

This feature enables you to include field importances in your batch predictions, i.e., a set of percentages indicating how much each field in your dataset contributed to the prediction for a given instance. You can include those values in your output file and dataset with either the BigML Dashboard or the API. This gives you a better understanding of your predictions by revealing which fields matter most for a given prediction.

supervised predictions batch predictions regression classification models ensembles api dashboard prediction
Topic Distributions
Image of Topic Distributions

Topic Models assume that each document exhibits a mixture of topics. The main goal of creating a Topic Model is to discover the topic importances for a given document. For example, a document may be 70% about "Machine Learning", 20% about "stock market" and 10% about "startups".

Topic Distributions allow you to make predictions for a single data instance, and Batch Topic Distributions help predict the same for multiple instances simultaneously. Based on a given Topic Model, BigML Topic Distributions provide a set of probabilities for each data instance (one probability per topic), which indicate the relative relevance of all topics for that instance.
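
As a rough illustration of what a topic distribution looks like for one document, the sketch below reuses the 70/20/10 example from above. The dictionary layout and the `rank_topics` helper are assumptions made for this example and do not reflect the format returned by the BigML API.

```python
def rank_topics(distribution, tolerance=1e-9):
    """Return (topic, probability) pairs sorted from most to least relevant.

    Hypothetical helper: it also checks that the probabilities sum to 1,
    as expected for one probability per topic.
    """
    total = sum(distribution.values())
    if abs(total - 1.0) > tolerance:
        raise ValueError(f"probabilities sum to {total}, expected 1.0")
    return sorted(distribution.items(), key=lambda item: item[1], reverse=True)


# One probability per topic for a single document.
document_topics = {"stock market": 0.2, "Machine Learning": 0.7, "startups": 0.1}
ranked = rank_topics(document_topics)
```

A batch topic distribution would amount to one such set of probabilities per input row.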

TopicModels TopicDistributions BatchTopicDistributions TopicProbabilities API Dashboard Fall2016 Unsupervised topicmodel
Topic Models
Image of Topic Models

The BigML team has brought Topic Models to the API and the Dashboard as part of the Fall 2016 release. Topic Models are an optimized implementation of Latent Dirichlet Allocation, a probabilistic unsupervised learning method that determines the topics underlying a collection of documents.

Topic Models' main application areas include browsing, organizing and understanding large amounts of unstructured text data, which can be very useful for information retrieval tasks, collaborative filtering or content recommendation use cases among others.

BigML provides two original visualizations that accompany its implementation so you can better inspect your Topic Model:

  • Topic Map: get an overview of your topic importances and their thematic closeness.
  • Term Chart: get an overview of the main terms that make up your found topics.

TopicModels Topics TermChart TopicMap API Dashboard Fall2016 Unsupervised topicmodel
Flatliner code editor & evaluator
Image of Flatliner code editor & evaluator

Flatline is BigML’s Lisp-like language that enables you to programmatically perform an array of data transformations, including filtering and new field generation. Flatliner is a handy code editor (available in our Labs section) that helps you test your Flatline expressions.

dataset labs filter transform sample flatline miscellaneous
Evaluation Comparison
Image of Evaluation Comparison

You can now compare multiple evaluations against a test set in ROC space. The graph can then be downloaded as a .png image, and the performance measures can be exported as a .csv file for further analysis.
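
For readers unfamiliar with ROC space, each evaluation is plotted as a single point: false positive rate on the horizontal axis and true positive rate on the vertical axis. The sketch below derives those coordinates from confusion-matrix counts. The helper name and the counts are made up for illustration.

```python
def roc_point(tp, fp, tn, fn):
    """Return (false_positive_rate, true_positive_rate) for one evaluation.

    Hypothetical helper; assumes the test set contains at least one
    positive and one negative instance.
    """
    true_positive_rate = tp / (tp + fn)
    false_positive_rate = fp / (fp + tn)
    return false_positive_rate, true_positive_rate


# Made-up confusion-matrix counts for two evaluations on the same test set.
evaluations = {
    "model_a": {"tp": 80, "fp": 10, "tn": 90, "fn": 20},
    "model_b": {"tp": 60, "fp": 30, "tn": 70, "fn": 40},
}
points = {name: roc_point(**counts) for name, counts in evaluations.items()}
```

Points closer to the top-left corner (low false positive rate, high true positive rate) indicate better classifiers, so `model_a` dominates `model_b` here.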

comparison auc evaluation labs
Google Integration
Image of Google Integration

With the Winter Release, you can now add sources to BigML through Google Cloud Storage and Google Drive, similar to our prior integrations with Dropbox and Azure Data Marketplace. You can also now log in to BigML using your Google ID.

google datasources google drive google cloud storage winter2015 miscellaneous
Projects
Image of Projects

We're happy to introduce Projects to help you organize your machine learning resources. Just create a new project using the web interface or the API and add a new source to it. All the new resources created from this source will be associated with the same project.

winter2015 projects miscellaneous
Dataset Comparison
Image of Dataset Comparison

This is another simple but useful application we have released in our new BigML Labs. It allows users to compare two different datasets side by side. Check it out here.

dataset winter2015 labs web
Sample Service
Image of Sample Service

BigML's new Sample Service provides fast access to datasets that are kept in an in-memory cache which enables a variety of sampling, filtering and correlation techniques. We have leveraged this new service to create a Dynamic Scatterplot visualization that we've released into BigML Labs.

sample dataset viz visualization winter2015 labs api
BigML Labs
Image of BigML Labs

Our team is constantly working on innovative applications built on top of BigML's API. We're now unveiling several of these in early access through our BigML Labs.

labs winter2015 miscellaneous
G-means Clusters
Image of G-means Clusters

This latest addition to BigML's unsupervised learning algorithms is ideal for when you may not know how many clusters you wish to build from your dataset.

gmeans winter2015 cluster
Cluster summary report
Image of Cluster summary report

Now you can download a Summary Report for your BigML Clusters. This report tells you how your data is distributed across your clusters, along with the associated features and data distances.

report cluster
BigML Comes to Australia and New Zealand
Image of BigML Comes to Australia and New Zealand

BigML is very pleased to announce that we've launched this new website to better serve our customers in Australia and New Zealand. This site will contain all of the content and functionality of our https://bigml.com site, but will provide faster performance as well as some localized content (e.g., local events and local training opportunities). Read more about it in this blog post.

australia new zealand opening miscellaneous
Anomaly Detector
Image of Anomaly Detector

BigML makes it easy to build a top-performing anomaly detector that will help you identify instances in your dataset that do not conform to a regular pattern.

fraud detection summer2014 anomaly
Batch Anomaly Scores
Image of Batch Anomaly Scores

You can quickly score multiple lines of data through BigML's Batch Anomaly Score. The output can be downloaded as a .csv and/or you can use it to automatically create a new dataset.

batches fraud detection anomalyscore summer2014
Anomaly Score
Image of Anomaly Score

You can score individual data points against your anomaly detector by using the web interface. Simply input the variables and BigML will provide you with an anomaly percentage (a higher score reflects greater anomaly).

score fraud detection anomalyscore summer2014
New dataset from batch prediction output
Image of New dataset from batch prediction output

Batch predictions are a powerful way to score likely outcomes on multiple rows of data. You can now create a new dataset directly from the batch prediction output (in addition to getting the output as a .csv file).

batch prediction batch centroid batch anomalyscore workflow summer2014 dataset
Models from clusters
Image of Models from clusters

Now you can automatically create a model for each cluster that will not only help you better understand the cluster, but also use it to classify new instances.

model summer2014 cluster
Modeling with missing splits
Image of Modeling with missing splits

Cleaning up data can be hard, and not every input value is always available at prediction time. To help with this, we have built a new option to create models whose predicates explicitly deal with missing values.
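
As a loose illustration of what a predicate that explicitly deals with missing values might look like, the sketch below builds a split test that states which branch a missing value should follow. The announcement does not describe BigML's internal predicate format, so `make_predicate` and its parameters are hypothetical.

```python
def make_predicate(field, threshold, include_missing):
    """Build a split test of the form `field <= threshold`.

    Hypothetical helper: include_missing states explicitly whether rows
    with no value for the field satisfy the predicate.
    """
    def predicate(row):
        value = row.get(field)
        if value is None:
            # The field is absent or empty: follow the branch chosen at training time.
            return include_missing
        return value <= threshold
    return predicate


young_or_unknown = make_predicate("age", threshold=30, include_missing=True)

results = [
    young_or_unknown({"age": 20}),      # value present and below the threshold
    young_or_unknown({"age": 40}),      # value present and above the threshold
    young_or_unknown({"income": 5000}), # age missing, handled explicitly
]
```

With a predicate like this, a prediction can still follow a well-defined path through the tree when an input field is absent.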

missing values ensembles summer2014 model
Online predictions
Image of Online predictions

New client-side predictions make it easier than ever to explore the influence of each field in your models, ensembles, or clusters. In addition, we are open-sourcing the related JavaScript libraries so you can use this functionality to build powerful, dynamic apps and web services.

free client-side javascript summer2014 prediction
Fast ensembles
Image of Fast ensembles

We have refined the way the models of an ensemble are built to save a great amount of time in data transportation. This will dramatically speed up creation of your ensembles.

fast summer2014 ensemble