Accurate & intelligent content classification at scale


To enhance your data management strategy, SkySync machine learning modules were architected from the ground up to classify unstructured data accurately and at scale. Their pre-trained A.I. engines dramatically reduce the IT effort needed to produce accurate results for PII, CCPA, and other compliance mandates.

The training module is designed to customize and improve the A.I. modules’ accuracy for your organization’s content specifically. It includes a feedback loop that improves classification accuracy over time through day-to-day system usage.

Leverage your investment in MIP

Microsoft Information Protection (MIP) extends the labeling and classification functionality provided by Microsoft 365. MIP and SkySync can coexist and complement each other, adding depth and breadth to an organization’s overall data management posture.

A SkySync custom policy can be created to detect and append a MIP data classification metadata label. Based on the label, SkySync can take the necessary actions to further track and protect [move] the file inside or outside of Microsoft.

  • Read, list, and use MIP labels in SkySync rules
  • Import MIP policy and label settings
  • Import MIP models
  • Extensions to use MIP classifier
  • Automatically apply MIP labels
  • Trigger MIP protection actions or RMS [Rights Management Services] protection
  • Compare SkySync classification results to user labeling with a Sensitivity Label Audit report

A wide array of classification methods

Select from an existing policy template library of recommended rules for your use case, or customize from a variety of options including location, size, extension, title, timestamps, sharing permissions, and owner and author information. File content rules are based on the data in the file, which is analyzed by advanced pattern matching and various A.I. modules for classification:

  • Advanced Pattern Matching
  • PII Identification and Extraction
  • Document Type Classifier
  • Standardized Form Matcher
  • Language Detection

Reduce false positives, improve accuracy and discover common sensitive values

  • Pre-defined common regular expressions from an ever-growing library
  • Regular expressions, validation routines, and keyword proximity
  • Social security numbers, driver’s license numbers, bank numbers, credit cards, financial account IDs, government document IDs
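
As an illustration of how a regular expression, a validation routine, and keyword proximity can work together to cut false positives, here is a minimal Python sketch for credit card numbers. It is not SkySync's implementation: the pattern, the keyword list, the 40-character window, and the function names are all assumptions chosen for the example. The validation routine is the standard Luhn checksum.

```python
import re

# Hypothetical pattern: 13 to 16 digits, optionally separated by spaces or hyphens.
CARD_PATTERN = re.compile(r"\b(?:\d[ -]?){12,15}\d\b")
# Hypothetical keyword list and proximity window (in characters).
KEYWORDS = ("card", "visa", "mastercard", "credit")
WINDOW = 40


def luhn_valid(digits: str) -> bool:
    """Validation routine: Luhn checksum used by payment card numbers."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:  # double every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def find_card_numbers(text: str) -> list[str]:
    """Return digit strings that match the pattern, pass Luhn, and sit near a keyword."""
    hits = []
    lowered = text.lower()
    for match in CARD_PATTERN.finditer(text):
        digits = re.sub(r"[ -]", "", match.group())
        if not luhn_valid(digits):
            continue  # right shape, but fails the checksum
        start = max(0, match.start() - WINDOW)
        context = lowered[start:match.end() + WINDOW]
        if any(keyword in context for keyword in KEYWORDS):
            hits.append(digits)
    return hits
```

In this sketch a 16-digit tracking number is rejected either by the checksum or by the absence of a nearby keyword, which is the kind of layered filtering the list above describes.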

Identify and extract personal data from files via deep A.I.

  • Names, ages, addresses, dates of birth, phone numbers, social security numbers, etc.
  • Identify PII entities from structured forms, as well as unstructured text using deep learning technology
  • Take action based on PII type, such as “label all documents with addresses as sensitive”
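
A rule such as “label all documents with addresses as sensitive” can be pictured as a lookup from detected PII type to label. The Python sketch below is purely illustrative: the rule table, the label names, their ordering, and the function name are hypothetical and do not reflect SkySync's actual rule format.

```python
# Hypothetical rule table mapping a detected PII entity type to a label.
PII_RULES = {
    "address": "sensitive",
    "social_security_number": "restricted",
}

# Hypothetical label ordering, from least to most restrictive.
LABEL_ORDER = ["unlabeled", "sensitive", "restricted"]


def label_for_document(pii_types: set[str]) -> str:
    """Pick the most restrictive label triggered by the detected PII types."""
    triggered = [PII_RULES[t] for t in pii_types if t in PII_RULES]
    if not triggered:
        return "unlabeled"
    return max(triggered, key=LABEL_ORDER.index)
```

Choosing the most restrictive label when several rules fire is one possible design; a real policy engine might instead apply every matching label or use rule priority.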

Identify and categorize documents as well as federal and state government forms

  • Identify files across 100 classification categories and 5,000 Federal & State Government forms
  • Identify forms by name & take action if a form has been filled in with “sensitive” information
  • Resumes, Bank Statements, Invoices, Contracts, Statements of Work, Tax Forms, Court Documents, Brochures, White Papers, Catalogs, Design Files, HR Files, Letters, Reports, Press Releases, etc.
  • IRS forms: W9, 1040, 1120, 1095-A, etc.
  • FDA documents: 0356h, 3911, 3926, etc.

Identify and group documents by language

  • Detect and identify an item’s written language from over 175 supported options
  • Use the item language in criteria for matching rules or to optimize additional A.I. processing
  • Use confidence scores to identify ambiguous cases and non-natural languages [code, machine-generated files, etc.]
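
To show how a confidence score can separate clear cases from ambiguous ones, here is a small Python sketch. The `LanguageGuess` type, the 0.80 threshold, and the "undetermined" result are assumptions made for the example; the product's actual scores and thresholds are not described in this document.

```python
from dataclasses import dataclass

# Hypothetical threshold; a real deployment would tune this value.
MIN_CONFIDENCE = 0.80


@dataclass
class LanguageGuess:
    language: str      # a language code such as "en"
    confidence: float  # 0.0 to 1.0, as reported by some detector


def route_by_language(guesses: list[LanguageGuess]) -> str:
    """Return the best language code, or 'undetermined' if no guess is confident enough."""
    if not guesses:
        return "undetermined"
    best = max(guesses, key=lambda g: g.confidence)
    if best.confidence < MIN_CONFIDENCE:
        return "undetermined"  # likely ambiguous text, code, or machine-generated content
    return best.language
```

Items that come back as "undetermined" could then be excluded from language-specific processing or sent for review.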

163 zettabytes of data will be created per year by 2025