
Features

Uncover Data Brilliance: Embrace Sparkflows for Analytics Excellence

Generative AI

Gen AI App Framework

Hugging Face Integration

Cloud-LLM Integration

AWS Bedrock

Azure OpenAI

Copilot

Content Synthesis

Google Gemini

RAG & Vector DB Support

ChatGPT

Gen AI Solution Patterns

NVIDIA

Content Generation 

Database Reporting

OCR

Chat Agent

Visual Application Development

Build workflows by dragging and dropping

Rich collection of 500+ Processors

View results of previous runs

Machine Learning

Classification / Clustering / Regression

Collaborative Filtering

Save / Load Model / Predict

Cross Validator

Forecasting / Deep Learning

What If

Machine Learning Engines

SparkML

H2O

TensorFlow

scikit-learn

XGBoost

Data Preparation

Prepare Data Seamlessly

Connect to various Sources & Sinks

Filter, Join, Group, Validate, and Impute Data, and more

Connect to various Sources & Sinks

Batch sources: HDFS, Apache Hive, Amazon S3

Streaming sources: Kafka, Flume

NoSQL sources: HBase, Solr, Elasticsearch

File Formats

Work with a variety of file formats, including CSV/TSV, Avro, Parquet, and JSON.

Intelligent Schema Inference for the various Datasets

NLP/OCR

Perform NLP on large-scale data with Apache OpenNLP & StanfordNLP

Perform OCR with Tesseract

Multi-tenancy & User Management

Users can share Datasets and Workflows with groups

Create users with different roles & permissions

LDAP Integration

Visualization

View output of workflows as Line Charts, Histograms, and Bar Charts

View Random Forests visually

Feature Generation

Tokenization

TF-IDF, One Hot Encoder

String Indexer, Impute, Scaler
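
For readers unfamiliar with these feature-generation steps, the following is a minimal, library-free Python sketch of tokenization, TF-IDF, string indexing, and one-hot encoding. It is not Sparkflows code and does not reflect how Sparkflows processors are implemented; the function names (`tokenize`, `tf_idf`, `string_index`, `one_hot`) and the smoothed IDF formula are assumptions chosen only for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately simple tokenizer)."""
    return text.lower().split()


def tf_idf(docs):
    """Return one {term: weight} dict per document.

    TF is the raw term count; IDF is the smoothed log((1 + N) / (1 + df)).
    This is one common variant, assumed here purely for illustration.
    """
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        weights.append({
            term: count * math.log((1 + n_docs) / (1 + doc_freq[term]))
            for term, count in counts.items()
        })
    return weights


def string_index(values):
    """Map each distinct label to an integer, most frequent label first."""
    ordered = sorted(Counter(values).items(), key=lambda kv: (-kv[1], kv[0]))
    return {label: idx for idx, (label, _) in enumerate(ordered)}


def one_hot(values, index):
    """Encode each label as a 0/1 vector with a single 1 at its index."""
    size = len(index)
    return [[1 if index[v] == i else 0 for i in range(size)] for v in values]


if __name__ == "__main__":
    docs = ["spark makes data flow", "data data everywhere"]
    print(tf_idf(docs))
    labels = ["b", "a", "b"]
    index = string_index(labels)
    print(index, one_hot(labels, index))
```

In this sketch a term that occurs in every document (such as "data" above) receives a weight of zero, which is the usual intent of IDF weighting: terms that do not distinguish documents contribute nothing.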

Developer Toolkit

Add code using SQL, Scala, and Jython nodes

Develop custom Nodes and have them available in Workflows

REST APIs

Access Sparkflows with a rich set of REST APIs

Workflows / Datasets / Dashboards / Execute Workflows / Access Results of Execution / Browse HDFS / Browse Hive

Dashboards

Assemble the output of various workflows and nodes into a Dashboard

Build Dashboards from Relational Sources, adding filtering & drill down capabilities

Workflow Scheduling

Schedule workflows to run at various times of the day, week, or month

Trigger workflows by events in a Kafka topic.

Streaming Analytics

Connect to Apache Kafka, Apache Flume, Sockets, Twitter

Perform Streaming Analytics

Load results into Apache HBase, Apache Solr, Elasticsearch, etc.

MLOps

Register

Deploy to Endpoints

Monitor

Track Feature Drift

Auto Retrain

Alerts & Notifications
