Podium’s “Intelligent Data Identification” Automates Business Insights Through its Smart Data Catalog

Business data ready in minutes with a rules engine and more than 80 profiling and validation statistics


BOSTON, Oct. 17, 2017 (GLOBE NEWSWIRE) -- Podium Data, Inc., creator of the Podium Data Marketplace™, today announced a new platform capability that automatically detects and handles high-value insights mined from big data in Hadoop and beyond. Podium’s new Intelligent Data Identification fortifies the data marketplace, combining a smart data catalog with a pattern recognition engine.

“451 Research believes that the need for data to be filtered, processed, treated and managed to make it suitable for multiple analytics use cases is critical to delivering value from the data lake,” said Matt Aslett, research director, data platforms and analytics, 451 Research. “Data governance and self-service data preparation are key elements of functional data lakes and associated data marketplaces, with machine learning-driven insights and recommendations an increasingly important aspect of accelerating the generation of value from enterprise data.”

Initially developed in partnership with a top 10 North American bank to automatically detect and protect personally identifiable information (PII), this capability is ideally suited to identify duplicate data, improve data governance, and reveal potential data corruption problems.

“This is the first step in using Podium’s smart data catalog to create additional business insights into the content and value of the data,” said Paul Barth, CEO of Podium Data. “Over time, this will enable many data sources to automatically share information based on their structure and meaning. As this evolves with machine learning and AI, soon users will not search to find data, the data will find the users.”

A Smart Data Catalog

Rich metadata must play a foundational role for any business in a data-driven environment. The metadata repository – aka “smart data catalog” – provides trusted, accurate information and uses pattern-matching technology to automate high-value actions such as:

  • Sensitive Data Detection – A code-free, customizable rules engine that enables field-level pattern detection of specific elements in datasets. It is a powerful tool for the auto-detection of – and action on – sensitive data such as
    - Personally Identifiable Information (PII),
    - Payment Card Industry Data Security Standard (PCI DSS) data,
    - Health Insurance Portability and Accountability Act (HIPAA) data, and
    - data relevant to improving data governance in preparation for the European Union’s General Data Protection Regulation (GDPR).
     
  • Duplicate Data Detection – Data silos lead to multiple copies of the same data. Moving the silos to the data lake does not remedy this problem, because identifying exact duplicates requires insight into the data itself. Podium detects duplicates through its analysis processes, giving businesses the confidence to eliminate duplication.
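
To make the two capabilities above concrete, the following is a minimal Python sketch of the general techniques, not Podium’s implementation. The rule names, the regular expressions, the 0.8 match threshold, and the function names are all assumptions chosen for illustration. The first function tags a field when most of its values match a pattern rule; the second computes an order-insensitive content fingerprint that can reveal exact duplicate datasets.

```python
import hashlib
import re

# Hypothetical rules (not Podium's): tag name -> pattern a field value must fully match.
RULES = {
    "US_SSN": re.compile(r"\d{3}-\d{2}-\d{4}"),
    "EMAIL": re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+"),
}


def detect_sensitive_fields(rows, threshold=0.8):
    """Return {field: tag} for fields where at least `threshold`
    of the non-empty values match one of the rules."""
    tags = {}
    fields = rows[0].keys() if rows else []
    for field in fields:
        values = [str(r[field]) for r in rows if r.get(field) not in (None, "")]
        if not values:
            continue
        for tag, pattern in RULES.items():
            hits = sum(1 for v in values if pattern.fullmatch(v))
            if hits / len(values) >= threshold:
                tags[field] = tag
                break  # first matching rule wins in this sketch
    return tags


def dataset_fingerprint(rows):
    """Order-insensitive fingerprint of a dataset's content.
    Two datasets with the same rows (in any order) get the same value."""
    row_hashes = sorted(
        hashlib.sha256(repr(sorted(r.items())).encode("utf-8")).hexdigest()
        for r in rows
    )
    return hashlib.sha256("".join(row_hashes).encode("utf-8")).hexdigest()


rows = [
    {"name": "Ann", "ssn": "123-45-6789", "contact": "ann@example.com"},
    {"name": "Bob", "ssn": "987-65-4321", "contact": "bob@example.com"},
]
tags = detect_sensitive_fields(rows)
same_content = dataset_fingerprint(rows) == dataset_fingerprint(rows[::-1])
```

In this sketch a field is tagged from its values rather than its name, which is one plausible reading of “field-level pattern detection”; a production rules engine would likely combine both signals.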

Metadata Drives Podium’s Intelligent Data Identification

Podium automatically analyzes, summarizes and discerns factors on every piece of data, giving businesses insight up front so users can make the next right move without delay.

At ingest, Podium automatically:

  • Collects metadata from the source system, documenting the expected source schema,
  • Looks through the full data set, validating each record against the expected format and sorting the records into good, bad, and ugly bins,
  • Builds a statistical profile of each field of data, generating statistical values and new profiling data such as min/max values or frequency distribution values,
  • Combines profiled metadata with a rules-based engine to identify, tag, and obfuscate sensitive data.
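
The validation and profiling steps above can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not Podium’s implementation: the schema, the function names, and in particular the meaning of the three bins are hypothetical, since the announcement does not define them. Here “bad” is assumed to mean a record with the wrong number of fields, and “ugly” a record with the right shape whose value fails to parse.

```python
from collections import Counter

# Hypothetical expected schema: field name -> parser that raises ValueError on bad input.
SCHEMA = {"id": int, "amount": float, "country": str}


def classify(record):
    """Assumed binning: 'bad' = wrong field count,
    'ugly' = a value fails to parse, otherwise 'good'."""
    if len(record) != len(SCHEMA):
        return "bad"
    for value, parser in zip(record, SCHEMA.values()):
        try:
            parser(value)
        except ValueError:
            return "ugly"
    return "good"


def ingest(records):
    """Sort every record into one of the three bins."""
    bins = {"good": [], "bad": [], "ugly": []}
    for rec in records:
        bins[classify(rec)].append(rec)
    return bins


def profile(good_records):
    """Per-field min, max, and most frequent values over validated records."""
    stats = {}
    for i, (name, parser) in enumerate(SCHEMA.items()):
        values = [parser(rec[i]) for rec in good_records]
        if not values:
            continue
        stats[name] = {
            "min": min(values),
            "max": max(values),
            "top": Counter(values).most_common(3),
        }
    return stats


records = [
    ["1", "9.50", "US"],
    ["2", "3.25", "CA"],
    ["3", "oops", "US"],  # right shape, unparseable amount
    ["4", "1.00"],        # missing a field
]
bins = ingest(records)
stats = profile(bins["good"])
```

Profiling only the records that passed validation keeps malformed values from skewing the min/max and frequency statistics; whether a real system profiles the rejected bins as well is a design choice this sketch does not address.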

For more about Podium’s Intelligent Data Identification

Hear about it – play the webinar
Read about it – here
Experience it – request a demonstration.

About Podium Data

Podium Data is radically simplifying and accelerating the way companies manage, prepare and deliver business-ready data – the lifeblood of the modern enterprise. The Podium Data Marketplace is a turnkey big data management platform that goes beyond data lakes to give business analysts self-service, on-demand access to trusted data while ensuring quality and control. More information is available at www.podiumdata.com and on Twitter at @PodiumData.

Media Contact
Glen Zimmerman gzimmerman@podiumdata.com