Channel: Blog – Talend Real-Time Open Source Data Integration Software

Revealing the Intelligence in your Data with Talend Winter’20 (part 1)


One of my favorite Talend customer success stories is the International Consortium of Investigative Journalists (ICIJ). I love this story not only because they transformed investigative journalism with data, won the Pulitzer Prize for the Panama Papers, and helped the public recover billions of dollars lost to illegal tax evasion. The story is also fascinating because they managed to decipher intelligence from highly disparate and unknown data obtained from some of history's largest data leaks. By reverse-engineering massive amounts of raw data using Talend and other innovative data management tools, they revealed some of the most important stories in the world.

This is the power of data intelligence. But what exactly is data intelligence? In a recent must-read blog, Stewart Bond from IDC positions data intelligence as "intelligence about data, not from data". He continues:

"Data intelligence leverages business, technical, relational and operational metadata to provide transparency of data profiles, classification, quality, location, lineage and context; enabling people, processes and technology with trustworthy and reliable data."

Under Stewart's leadership, IDC links this data intelligence concept to a market category, data intelligence software, as part of its data integration and intelligence software taxonomy.

With the Winter '20 release of Talend Data Fabric, we believe we are taking the power of data intelligence to the next level. That's why I'm so eager to explain its importance in this blog and in a series of webinars across regions on March 18th and 19th.

Why data intelligence is critical to digital transformations

It has become commonplace to say that data is the lifeblood of digital transformation and that it affects every aspect of business – driving revenue, accelerating innovation, transforming customer experiences, and lowering costs and risks.

Unfortunately, few companies manage to truly transform their business like ICIJ did. There is a data intelligence gap. As data volumes grow ever larger and data stays locked away in silos, the data opportunity tends to end in data chaos, where people can't find the data they need – and even when they can, it's often of poor quality and difficult to use. In short, most businesses are not in control of their data.

This also translates into a huge efficiency and productivity crisis. As part of its data intelligence research, IDC has shown that data professionals spend 67% of their time searching for and preparing data, and only 12% of their time delivering the insights needed to turn data into concrete business outcomes.

Lastly, there is a severe data talent shortage. Data professionals, especially specialists such as data engineers, AI specialists, DevOps engineers, data analysts, and data protection officers, stand out as the scarcest and most in-demand talent around the world in 2020. Not only do companies need to attract these people and up-skill their workforce, they also need to find ways to help their current teams deliver more.

Revealing the intelligence in your data with Talend Winter ‘20

So how can Winter '20 address those challenges and help our customers reveal the intelligence in their data?

Data chaos can be addressed with the ability to capture data intelligence at first sight, from every data point across the data landscape. You can connect to a wide range of data sources (we introduced or enhanced dozens of connectors in this release), automatically extract metadata from those sources, and document them in a single place as shared datasets with our new Talend Data Inventory. Talend then automatically calculates a Data Intelligence Score based on data quality profiling, data popularity, and crowdsourced ratings and endorsements.

The efficiency crisis can be tackled with accelerated data engineering. We brought dozens of smart new functions into Talend Pipeline Designer. Data engineers and citizen data integrators can integrate, standardize, cleanse, and enrich data in a single cloud-native, unified app, while in-flight data quality eliminates problems before the data is consumed or replicated. No coding or complex transformations are required, which increases development and maintenance productivity. Data professionals become more productive, boosting data intelligence while shortening the time it takes to get up to speed.

In addition, the Winter '20 release extends the use of Artificial Intelligence (AI) across the Talend Data Fabric platform. Data intelligence is democratized with an AI-enabled Magic Fill to shape data the way you want. Intelligent data quality brings the human into the AI loop for faster and more precise data matching and empowers data intelligence at scale.

Try Data Fabric now!

Winter’20 is our latest advancement of Talend Data Fabric to reveal the intelligence in your data. In addition to what I covered in this blog, there are hundreds of new capabilities that you can discover here. But your transformation won’t stop here. The power of the cloud allows us to deliver continuous innovation in data integration, data integrity, and data intelligence, so that we can best support any digital transformation efforts. Note that Talend 7.3, the on-premises version of our Data Fabric, is also part of this launch.

Keep in mind that a key differentiator of Talend Data Fabric is that the innovations we bring are not delivered through a set of siloed products, but through a single platform that brings and manages all kinds of data together under one roof. Talend Data Fabric brings a unified approach to data integration, quality, governance, and data sharing among stakeholders.

Finally, because Talend Data Fabric is delivered as an iPaaS, you are just a few clicks away from revealing the intelligence in your data and transforming your business with data, like ICIJ and many other Talend customers did. Why not try Winter '20 now?

The post Revealing the Intelligence in your Data with Talend Winter’20 (part 1) appeared first on Talend Real-Time Open Source Data Integration Software.


You can trust us: we are HIPAA compliant


 


 

Can you keep a secret? What will it take for me to trust you to keep and protect a secret that I share with you? If you are a friend or family member, I may not need more than you saying “Yes”, but if I don’t know you, I will likely want additional guarantees or proof that I can trust you.

This is particularly true if you are an organization handling personal information about me. In such a case, I will want to be reassured by others that you are trustworthy and that my information will be safe with you.

In our digital world, where data flows easily and invisibly and cybercrime is on the rise, it is becoming harder to reassure us that our data will be safe with anyone.

Recent data protection regulations or laws such as the GDPR, CCPA, or HIPAA are meant to address this problem. They help organizations implement and maintain best data protection practices, and help ensure that whoever an organization works with can keep information safe.

While GDPR and CCPA are concerned with any Personally Identifiable Information (PII), HIPAA is concerned with Protected Health Information (PHI).

 

What is HIPAA?

Since 1996, the Health Insurance Portability and Accountability Act (HIPAA) has been making sure that our healthcare data is protected under US law. In order to ensure HIPAA compliance, any company that deals with such data, often referred to as electronic protected health information (ePHI), must have technical and nontechnical safeguards in place to secure it.

Also, anyone providing treatment, payment, or operations in healthcare, and any other business or entity that has access to PHI and provides support in treatment, payment, or operations, must meet HIPAA compliance requirements (and ensure that their subcontractors are in compliance as well).

 

How much does HIPAA cost?

In 2013, the U.S. Department of Health and Human Services estimated HIPAA implementation would cost all covered entities (CEs) between $114 million and $225.4 million. This estimate, of course, depends on multiple variables such as an organization's type, size, culture, technical environment, and dedicated HIPAA resources. What is certain are the fines for noncompliance, which range from $100 to $50,000 per violation, with a maximum fine of $1.5 million per violation category per year.
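To make the penalty cap concrete, here is a minimal arithmetic sketch (purely illustrative, using the figures quoted above):

```python
def annual_penalty(violations: int, per_violation: int,
                   cap: int = 1_500_000) -> int:
    """Fines accrue per violation, but are capped per violation
    category per calendar year (figures from the text above)."""
    return min(violations * per_violation, cap)

# 50 violations at the $50,000 maximum would be $2.5M uncapped,
# but the annual per-category cap limits it to $1.5M:
print(annual_penalty(50, 50_000))  # 1500000
```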

 

You can trust Talend: we are HIPAA compliant

In order for Talend to sell to entities that would process ePHI with our products, we have to be HIPAA compliant too. It is as simple as that.

Over the past year, we have been working hard to become HIPAA compliant, and on February 11 of this year we announced in an official press release that we had qualified as a business associate under HIPAA and had become certified under the EU-U.S. Privacy Shield.

With these compliance standards met, we will now be able to expand our work within the healthcare industry and assist more clients that may handle ePHI.

 

What are the benefits of HIPAA compliance for our customers?

Naveen Venkatapathi, the president of Talend partner Wavicle Data, which jointly creates solutions for Talend’s customers (including those in the healthcare industry), is delighted by the new certification. He notes, “By incorporating technical, physical, and administrative safeguards to protect PHI, Talend has made it much easier and safer for customers to get a complete view of patient care or provider operations, for example.”

He points out that the benefits of Talend for healthcare data integration include:

  • Compliance with HIPAA and HL7 healthcare standards
  • Reduced development time and fewer manual coding errors, lowering support costs
  • Avoidance of data loss and penalties
  • Improved data quality
  • Standardized integration of health data
  • Analytics enabled by reliable data in the data warehouse
  • Regulatory requirements met with HIPAA-compliant data

 

Naveen also notes that with the new certification, “we can implement customer solutions faster and can rely on the compliance Talend provides out-of-the box. We can also use the flexibility that Talend provides to configure and develop any unique custom features needed for the customer solution and still stay compliant with HIPAA.”

 

A HIPAA-compliant customer solution

Wavicle Data and Talend worked together to develop a solution to solve a complicated analytics landscape in a medical device manufacturing company that does business with distributors, clinics, hospitals and patients.

The customer had grown by acquisition of several other companies and found itself with nearly a dozen ERP systems. Getting company reports and analytics from this data was complicated and slow.

Wavicle Data worked with the customer to build a cloud-based Redshift data warehouse to aggregate and standardize the data from these many systems. Since patients are directly involved, the solution had to be HIPAA-compliant to ensure patient PHI and personal information are safeguarded. The customer wanted a data integration solution to move data from many siloed systems into the data warehouse, and it had to be HIPAA-compliant from the start.

This customer chose Talend as the data integration and integrity solution because it was HIPAA-compliant — with pre-built components for HIPAA, data privacy, and data security alongside EDI capabilities. This made it easy to build a solution that is HIPAA compliant from end-to-end. With Talend’s help, the customer was able to integrate with trading partners securely in EDI formats.

 

Beyond HIPAA: our commitment to Security and Privacy

In order to maintain HIPAA compliance, we will ensure that reasonable and appropriate technical and non-technical safeguards are in place.

HIPAA compliance is just one of the standards that we want to offer to our customers as guarantees that they can trust us with their data (ePHI or other).

As Chief Information Security Officer (CISO) I will make sure that we lead with Security and Privacy.

Stay tuned for more Security and Privacy updates from us.

 

The post You can trust us: we are HIPAA compliant appeared first on Talend Real-Time Open Source Data Integration Software.

Talend’s COVID-19 Safety and Business Continuity Plan: An executive message


In times of uncertainty, we will turn to those we trust for support. At Talend, we value the trust that our partners and our customers have in us and we are committed to fully supporting them and their business success.

We continue to be vigilant in monitoring the current Coronavirus Disease 2019 (COVID-19) pandemic and how it may affect our customers, our employees, our families, and our global communities.

We’re privileged to be a trusted partner to thousands of companies worldwide. To continue providing them with the high level of service they expect, we know we must be at our best. Below are a few of the measures currently in place at Talend:

People
As a global data company, we've already built virtual capabilities into our business plan, and nearly all of our competencies are resourced in multiple regions globally. As an example, for consulting and support, staff is distributed such that one outbreak within a team would not prevent continued operations.

We’ve asked all Talend employees who have recently traveled and have concerns regarding exposure to contact their medical providers for guidance. In addition, we’ve asked that all employees immediately call their supervisor and not come into an office or go to a client site or meeting if they believe that they’ve been exposed to someone who has been diagnosed with or is suspected of having COVID-19.

Out of an abundance of caution, we're asking that all non-essential business travel be postponed and that remaining travel be limited to within each region. We will continue to support our customers' needs, and we realize that some in-person meetings are essential to delivering our services. We'll use videoconferencing where available and when preferable to in-person meetings. Furthermore, employees are working remotely where possible.

Systems
To ensure the availability of essential services, Talend has a comprehensive Business Continuity Planning Policy supporting business continuity, business disruption preparedness, and total business recovery, with global and regional components. Each Talend office has plans in place that will enable us to effectively service our customers in the event an emergency arises. We’ve maximized our ability to provide the best network and system performance in the event any of our offices need to close. We’re communicating with our vendors to understand their business continuity plans, with a focus on continuity of services and product availability.

Our Continued Customer Commitment
Access to data you can trust at a time when you’re managing uncertainty and risk is of paramount importance. We’re here to support your business. We strive to continue to proactively respond and communicate with customers and employees and remain committed to being a responsible partner to the communities in which we operate.

We’re grateful to our customers and remain committed to their success. If you have questions or need anything please reach out to me or Jamie Kiser, our Chief Customer Officer, at jkiser[@][talend.com].

We wish you and your loved ones good health and safety.

The post Talend’s COVID-19 Safety and Business Continuity Plan: An executive message appeared first on Talend Real-Time Open Source Data Integration Software.

How Data Is Transforming the Fight Against Pandemics



 

The more time I spend working with data, and watching how our customers work with data, the more convinced I am of two things: 1) the power to do extraordinary things is embedded within data and 2) all of us working or dealing with data have a role to play in using our knowhow and technology to apply data to benefit humanity and tackle some of the biggest challenges of our lifetime – the environment, equality, education, health and safety.

Right now, we live in uncertain times and face challenges that we haven’t had to deal with in many decades. I believe in the power of human ingenuity to win the fight against the worldwide COVID-19 pandemic. And I believe that data has the power to enable people to reach their full potential in uncertain environments. The answers to complicated questions are all there in the data — we just have to find them, and we have to do so while respecting the privacy, security, and dignity of patients and healthcare workers.

It’s an enormous responsibility. And I’m excited and fascinated by the work of so many companies and public service organizations who are taking on this challenge of fighting this pandemic with information.

Our secret weapon — massive datasets

One of the hallmarks of trying to solve a viral pandemic is that it is a fight against time as well as a virus. We are always struggling to keep up against the curve of infection, and we start out behind.  That’s why understanding data is so critical to fighting this virus — being able to comprehend its structure, its spread, and effective treatments gives us the advantage of time as scientists work to get it under control. The faster we understand the virus, the faster we can treat patients and develop vaccines.

What’s fascinating to me about the medical and biotech community’s fight against COVID-19 is that the advances in artificial intelligence and machine learning are giving us an enormous advantage against this virus. We are able to crunch massive amounts of data in order to find the information we need. Fighting pandemics has always depended on careful observation of behavior and meticulous attention to documenting events. But now, instead of depending on a small number of observations, we can use huge datasets to give us an advantage in the war against both time and the virus. Here are a few examples of how that’s working:  

1) Google's DeepMind AI Unit has been using its AI learning models to share understanding of the protein structures of COVID-19. DeepMind is using a machine learning method called "free modeling" to compare viral protein structures; it hopes its findings will cut down on the months of effort it usually takes to determine a virus's protein structure, getting us to treatments and a cure faster.

2) Boston Children’s Hospital has developed an infectious disease tracking platform called Healthmap, which uses data confirmed by public health agencies around the world. It tracks the spread of COVID-19 in real-time and has proven invaluable in understanding how the disease is transmitted. But even more importantly, says project manager Kara Sewalk, her team needs to have a “reliable dataset to use after the outbreak is over, for epidemiological research to help prevent or minimize future outbreaks.”

3) Hospitals in both China and South Korea have used AI systems to diagnose coronavirus symptoms, getting the diagnosis more quickly and alleviating the shortage of testing kits.

How data will prevent pandemics in the future

There seems to be no doubt that the coronavirus will change a lot of things in our society. Human beings are social creatures, and that won’t — and shouldn’t — ever change. But there is a lot of room for technological innovation to eliminate needless transmission points for infectious agents.

For example, it makes sense to deploy robotic systems to sterilize hospital equipment and deliver food and medicine in hospitals. Several robotics companies are developing and repurposing robots for work in clinical settings.

When healthcare systems get overwhelmed, it isn’t just the frontline staff who are affected — though all of us are enormously grateful for their work. Backoffice functions like processing claims and billing can get overwhelmed as well, slowing down efficient treatment. Blockchain technologies have shown promising progress in streamlining these processes.

Data quality is essential for understanding how pandemics like COVID-19 grow and spread. Partnerships between data modelers and public health organizations are going to be critical in the coming years if we are to understand and stop the transmission of novel viruses. These partnerships have been instigated by the spread of coronavirus, and as they grow and deepen, the quality and accuracy of these models will only improve.

Data is the future — if handled well

It seems clear that data is an essential tool in the fight against COVID-19. I am a firm believer in people’s creativity and ingenuity, and I believe that access to data has proven to be an important way to navigate our way through these difficult times. Data has the power to unlock potential and lead to great discovery, and the advances in data technology are making the fight against infectious disease evermore successful.

Learn more from the Centers for Disease Control and Prevention and the World Health Organization about what we know about COVID-19, the novel coronavirus.

Please stay safe and healthy.

The post How Data Is Transforming the Fight Against Pandemics appeared first on Talend Real-Time Open Source Data Integration Software.

Capturing data intelligence at first sight with Talend Data Inventory


Think about your experience when you book a hotel room, order a taxi, or purchase something online. You reach the best offer in a few clicks and you get additional guidance with ratings so you can predict the quality of the goods or services you’re buying. It’s really helpful to have all that additional information. So why don’t we get a similar experience when consuming data?

Well, now you can. Welcome to Talend Data Inventory, our new cloud-native application within Talend Data Fabric. It is currently accessible in our Early Adopter Program, will be generally available in April in Talend Cloud on AWS, and will come to Azure later in Q2. It accelerates data-to-value with automated, searchable dataset documentation, quality proofing, and promotion. It captures datasets across data sources and targets, turning them into reusable, shareable data assets with a single point of governance and access.

With Data Inventory, datasets are augmented with an automatically calculated Trust Score that delivers an instant assessment of your data's health and accuracy, based on data quality, data popularity, and user-defined ratings. This means that Data Inventory not only accelerates access to data but also lets you assess its relevance and trustworthiness at first sight.
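Talend doesn't publish its scoring formula, but conceptually a score like this can be sketched as a weighted blend of the three signals named above. The weights, scales, and function below are illustrative assumptions, not Talend's actual implementation:

```python
def trust_score(quality: float, popularity: float, avg_rating: float,
                weights=(0.5, 0.2, 0.3)) -> float:
    """Blend three 0-100 signals into one 0-100 trust score.

    quality:    automated data quality profiling result
    popularity: how often the dataset is used and shared
    avg_rating: crowdsourced user ratings, rescaled to 0-100

    The weights are made up for illustration; Talend's real
    formula is not public.
    """
    wq, wp, wr = weights
    return round(wq * quality + wp * popularity + wr * avg_rating, 1)

# A well-profiled but rarely used dataset still scores reasonably high:
print(trust_score(quality=92.0, popularity=30.0, avg_rating=80.0))  # 76.0
```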

See below how Data Inventory visualizes available datasets with their Trust Scores, User Ratings, Data Quality Ratings, and Endorsements.

 

Data Inventory

 

Using advanced search technologies, Talend Data Inventory delivers immediate access to the right data for anyone and cuts the time it takes to find and consume the data you need. As you can see below, you can filter datasets based on their trust score, average rating, and more.

Imagine a user who wants to find data in the customer domain with the highest trust score for their analytics, or the data owner of that domain who wants to find the data with the lowest trust score in order to delete or curate it.

 

Data Inventory

 

Data collaboration now becomes something that everyone can do. Anyone — from data practitioners to data consumers — can tag, rate, or comment on data sets exposed in Data Inventory. Data owners can also add descriptions, endorsements, and custom attributes.

In addition to automatic data profiling and crowdsourced ratings, Data Inventory also monitors the popularity and shareability of the data. Leveraging all this information makes it easy for data consumers to find trusted data that fits their needs.

Once users discover a dataset, they get access to the dataset ID Card shown below, which brings even deeper insights into the underlying data so they can focus on what's essential: extracting value from the data.

 

Data Inventory

But this experience doesn’t stop at merely finding the data. Because Data Inventory is tightly integrated with Pipeline Designer and Data Preparation, everyone can dive into the data, taking immediate actions on the datasets and turning them into valuable insights.

See below how the same dataset inventory pops up across the new Data Inventory “stand-alone” application, Pipeline Designer, and Data Preparation.

Data Inventory

Using Data Inventory, making data meaningful and trustworthy suddenly becomes effortless, systematic, and automatic. Any data professional can easily find and leverage trusted datasets, turning previously inert and siloed datasets into actionable and valuable insights. Data Inventory reinvents collaboration for the benefit of data-driven organizations.

 

Learn more about Talend Data Inventory here, or contact me through LinkedIn if you want to try it out as an early adopter before its general availability at the end of April.

 

Still curious to see Talend Data Inventory in action?

 

Think Data Inventory is cool?
See all the new components and features Talend Data Fabric brings in our latest Winter ’20 release!

 

The post Capturing data intelligence at first sight with Talend Data Inventory appeared first on Talend Real-Time Open Source Data Integration Software.

How Talend is joining the fight against COVID-19: unlocking the best data for health researchers


The novel coronavirus, COVID-19, presents challenges the world hasn’t seen for decades. Humans have fought global pandemics before, and it isn’t easy. But we have an additional weapon on our side this time — data.

Data helps researchers understand the spread of the disease, how it is transmitted, and the rate at which transmissions occur from the initial infection. Data is invaluable in defeating this virus. But researchers face unique challenges in working with health care data. New files are added to public health databases each day. Getting them aggregated is one big challenge, and after files are joined together, the data must be matched up to ensure time series are accurate (e.g. incidents by date, or by location). In many cases, data must be cleaned up as well.
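As a rough illustration of the aggregation and time-series alignment challenge, the sketch below merges daily case rows from two files and fills date gaps so each region's series stays contiguous. The field names and figures are hypothetical, not taken from any real COVID-19 feed:

```python
from collections import defaultdict
from datetime import date, timedelta

# Hypothetical daily case records pulled from two public files:
file_a = [{"date": "2020-03-01", "region": "A", "cases": 10},
          {"date": "2020-03-03", "region": "A", "cases": 25}]
file_b = [{"date": "2020-03-02", "region": "A", "cases": 15}]

def build_series(*files):
    """Aggregate rows from many files into one per-region time series,
    inserting zero-count days so the series has no date gaps."""
    by_key = defaultdict(int)
    for f in files:
        for row in f:
            by_key[(row["region"], date.fromisoformat(row["date"]))] += row["cases"]
    series = {}
    for region in {r for r, _ in by_key}:
        days = sorted(d for r, d in by_key if r == region)
        out, day = [], days[0]
        while day <= days[-1]:
            out.append((day.isoformat(), by_key.get((region, day), 0)))
            day += timedelta(days=1)
        series[region] = out
    return series

print(build_series(file_a, file_b)["A"])
# [('2020-03-01', 10), ('2020-03-02', 15), ('2020-03-03', 25)]
```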

Health care data professionals and other researchers need data of the highest quality and accuracy. In addition, the cleaner and more accurate the datasets are, the faster they are to ingest and work with.

That’s where Talend steps in.

What’s currently in the COVID-19 datasets?

In collaboration with developers from the Singer open source community, a joint team from Talend and Bytecode has created a tool to ETL COVID-19 datasets. We standardize the data, augment it with metadata, then route the results to a data warehouse or data lake: Amazon Redshift, Amazon S3, Snowflake, Microsoft Azure Synapse Analytics, Delta Lake for Databricks, or Google BigQuery. Data engineers and scientists can run the tool on their own infrastructure or use Stitch for free.

The COVID-19 integration covers several datasets:

The data stored in these repositories lacks a common format. For instance, the EU data comprises data from different countries, and the header names for the same type of data differ between them. Even slight differences like these require data professionals to take extra time and steps to cleanse and standardize the data. Having these datasets processed through our ETL gives users guaranteed consistency, so they can focus on their models or visualizations and make faster, more confident decisions.
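The header standardization described above can be sketched as a simple rename pass. The variant header names and canonical schema below are illustrative assumptions, not the tool's actual mapping:

```python
# Hypothetical header variants seen across country feeds, mapped to
# one canonical schema (names invented for illustration):
HEADER_MAP = {
    "CountryExp": "country", "countriesAndTerritories": "country",
    "NewConfCases": "new_cases", "cases": "new_cases",
    "DateRep": "report_date", "dateRep": "report_date",
}

def standardize(row: dict) -> dict:
    """Rename known header variants to the canonical schema,
    passing unknown columns through unchanged."""
    return {HEADER_MAP.get(k, k): v for k, v in row.items()}

print(standardize({"CountryExp": "France", "NewConfCases": 285}))
# {'country': 'France', 'new_cases': 285}
```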

How the COVID-19 dataset works

The tap uses the GitHub V3 API to query and retrieve files stored in multiple GitHub repositories. Users must supply a GitHub token, which raises the number of API calls the tap is allowed to make. They can then select one or all of the supported datasets and the fields associated with them, choose one of the Stitch destinations, and set the load frequency. Given that the data is typically updated more than once a day, we suggest a frequency of every 6 to 12 hours, but you can choose more frequent replication.
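Singer taps conventionally read their settings from a JSON config file. The sketch below shows what such a config might look like for this tap; apart from the GitHub token requirement stated above, the field names are assumptions, so check the tap's README for the exact keys it expects:

```python
import json

config = {
    # A GitHub personal access token raises the API rate limit (see above):
    "api_token": "<your-github-personal-access-token>",
    # Earliest data to replicate (illustrative key name):
    "start_date": "2020-01-22T00:00:00Z",
}

# Write the config file the tap would be pointed at:
with open("config.json", "w") as f:
    json.dump(config, f, indent=2)
```

You would then run the tap in discovery mode to select datasets and fields, and let Stitch (or your own runner) handle scheduling and loading into the destination.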

COVID-19 dataset

How to access and explore the COVID-19 integrated dataset

These datasets should be beneficial to anyone doing health research.  Interested researchers can run the data import for free on the Stitch platform. Here are all the options for accessing the data or joining the effort to further build out this dataset.

GitHub repo: https://github.com/singer-io/tap-covid-19
Singer tap: https://www.singer.io/tap/covid-19-public-data/
Stitch integration: https://www.stitchdata.com/integrations/covid-19/ 
Documentation: https://www.stitchdata.com/docs/integrations/saas/covid-19

 

Finally, the entire dataset is available in a read-only Snowflake data warehouse found at

https://nxa21939.us-east-1.snowflakecomputing.com/console/login#/

Log in with username `covid_user` and password `analysis`.

 

COVID-19 integrated dataset

 

The post How Talend is joining the fight against COVID-19: unlocking the best data for health researchers appeared first on Talend Real-Time Open Source Data Integration Software.

Talend’s Next Chapter


 

Today we open a new chapter at Talend, in which we begin our journey from a $250M company to a $1 billion cloud market leader. Over the last six years, I’ve been honored to help build and lead the team that brought Talend from a $50M startup, through its IPO in 2016 to become a quarter-billion-dollar company. Together, we built one of the fastest-growing cloud businesses in the world. Now it’s time for me to take a step back and welcome our new leader who will help propel Talend through its next growth phase. As of today, I’ll be handing the reins over to Christal Bemont, who will become Talend’s new CEO.

 

I’m thrilled to welcome Christal to Talend. She joins us from SAP Concur, where she was most recently CRO, responsible for leading the global go-to-market team that grew the business into a multi-billion-dollar cloud market leader. Prior to becoming CRO, she spent 15 years at Concur in an expanding series of sales and leadership positions. During that time, she was instrumental in shaping the company’s go-to-market strategy, growing its largest global clients, and leading multiple sales teams to success at scale. She brings exceptional leadership skills along with unique expertise in scaling a large cloud business. This is exactly the skill set we need to grow Talend into the $1B cloud market leader that I know we can become.

 

To win the lion’s share of the exploding cloud data market, we also need to double down on our customer-first strategy. To spearhead that effort, we’re bringing in Ann-Christel Graham to the newly-created position of Chief Revenue Officer and Jamie Kiser as Chief Customer Officer. AC and Jamie worked closely together with Christal at SAP Concur and bring significant cloud industry go-to-market expertise.

 

What does this new phase mean for our customers and partners? Talend has come this far in large part because our mission has always been to help our customers eliminate roadblocks in their journeys to becoming data-driven. As more companies migrate to the cloud, we believe those challenges will grow and become more complex. After all, data is everywhere: it’s being generated by every system that companies use to power their businesses, and it’s being collected at every customer touchpoint. Properly deployed, data can redefine a company’s fortune and future, but it’s often inaccessible, frequently bad, and in the case of customer data, extremely risky to manage.

 

As we pursue our next phase of growth and cement our leadership position, we’ll continue developing a dynamic roadmap that removes the roadblocks data-driven customers face. As always, we’ll play a vital role in ensuring the data that companies use to make critical business decisions is accessible, clean, and compliant—and we’ll help companies do so wherever they are in their data journey. As we have over the past several years, we’ll also work to ensure we’re delivering not just the best product set for our customers, but also the best service and support capabilities worldwide.

 

We believe the cloud will drive the majority of our future growth, and the progress we’ve made in the past year positions us for unprecedented success. Our cloud business continues to grow well over 100%; it’s now over half of what we sell and we’re solving our customers’ most demanding cloud data requirements. We have an incredible team, a great product set and, in Christal, Ann-Christel and Jamie, new leaders with the expertise ideally suited to help us meet our goals. Our new team members have the go-to-market, cloud and leadership skills needed to tackle the challenges ahead. As a board member, I’ll continue to participate in Talend’s business and welcome you to stay in touch and let me know how your data journeys are progressing. I’m sure you all share my excitement and optimism for the future of the company and will welcome Christal, Ann-Christel, and Jamie in their new roles.

 

The post Talend’s Next Chapter appeared first on Talend Real-Time Open Source Data Integration Software.

Enabling Olympic-level performance and productivity for Delta Lake on Databricks


Databricks lakehouse performance

 

Recently, Databricks introduced Delta Lake, a new analytics platform that combines the best elements of data lakes and data warehouses in a paradigm it calls a “lakehouse.” Delta Lake expands the breadth and depth of use cases that Databricks customers can enjoy. Databricks provides a unified analytics platform with robust support for use cases ranging from simple reporting to complex data warehousing to real-time machine learning.

If you use Delta Lake, you have just one place to put data and one place to deploy your jobs, making your architectures more streamlined. And for Databricks users, as for any developers who want things streamlined, a tool like Talend Data Fabric that automates processes can become a “killer app,” thanks to the improvements it brings in productivity, manageability, and governance.

The ‘magic’ of Delta Lake

Databricks is well-known as the team that invented Spark and the first company to commercialize it in a cloud-based environment. Not resting on those laurels, Databricks took another leap forward with Delta Lake, which expands Spark’s processing power with robust database capabilities to serve data to downstream systems. With the introduction of Delta Lake, Databricks has introduced a new file format called Delta that allows for ACID transactions, data history (a.k.a. time travel), schema enforcement (i.e. a string field must be a string), schema evolution, and the ability to continuously stream data including updates.

Using Delta Lake, the general pattern is to have a “multi-hop” architecture that lands data into ingestion tables (Bronze), then refines those tables with a common data schema and standardized data for consistency (Silver), then publishes that data as features or aggregated data store (Gold).

Data professionals encounter unique challenges at each of those steps, which are represented in this figure.

 

3 levels of Data Lake

Figure 1: Image from http://delta.io

Let’s talk about the requirements for building a trusted lakehouse and how Talend can help you prepare and publish trusted data to all downstream users.

 

Building a lakehouse with Talend – Earn your Olympic medal

The figure above shows three logical data stages:

  • Bronze data ingested from external systems into a raw or landing zone
  • Silver data, refined as tables in a common data model in which the data conforms to quality expectations
  • Gold data, ready to finalize and apply in data science models and machine learning techniques

 

Bronze — ingestion

Most organizations have hundreds of sources they might need data from. Some are simple, like .CSV files, while others are complicated API-based integrations like Salesforce and Marketo. To populate a lakehouse for historical analytics, you need to perform a one-time ingestion (bulk load) of existing records. If you want to keep your data current, as most organizations do, you need to replicate incremental changes to the lakehouse on a scheduled basis, moving only new or changed data each time.
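The difference between the two patterns is easy to sketch. The following plain-Python example illustrates watermark-based incremental extraction; it shows the underlying idea only, not Stitch's or Talend's actual implementation, and the `updated_at` column name is a stand-in for whatever change-tracking field a source exposes:

```python
from datetime import datetime

def incremental_extract(rows, last_watermark):
    """Return only the rows changed since the last run, plus the new
    watermark to persist for the next scheduled replication."""
    changed = [r for r in rows if r["updated_at"] > last_watermark]
    new_watermark = max((r["updated_at"] for r in changed),
                        default=last_watermark)
    return changed, new_watermark

source = [
    {"id": 1, "updated_at": datetime(2020, 6, 1)},
    {"id": 2, "updated_at": datetime(2020, 6, 5)},
    {"id": 3, "updated_at": datetime(2020, 6, 9)},
]

# One-time bulk load: a watermark at the epoch picks up every record.
full, wm0 = incremental_extract(source, datetime(1970, 1, 1))

# Scheduled incremental run: only rows updated after the stored
# watermark are replicated to the lakehouse.
delta, wm1 = incremental_extract(source, datetime(2020, 6, 5))
```

In practice the watermark would be persisted between runs, and CDC-capable sources can also replicate updates and deletes rather than relying on a timestamp column.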

Remove the barriers to data ingestion

Stitch Data Loader for Databricks Delta: This month, we added Delta Lake on Databricks as a supported destination for Stitch Data Loader. Stitch is a cloud-first platform for simple data ingestion and replication.

All Databricks users now have a very quick, easy, and reliable way to ingest data from more than 100 SaaS and database sources into their lakehouses. It takes only minutes to set up a Stitch replication job and begin loading data into Delta Lake. There’s no coding, nothing to install — Stitch makes data ingestion easy for everyone, including nontechnical users.

If you’re building BI reports on the raw “Bronze” data, you‘re ready to go.

 

Silver – refined tables

At the “Silver” level your data is ready to use in business intelligence and enterprise reporting. It’s in a common data model and the data content itself has gone through your data quality rules, ensuring that it meets your organization’s expectations for trusted data.

However, most of the time you’re dealing with more than one data source, and two or more sources seldom have the same schema. You need a common schema, and that requires a mapping exercise so that the data from the ingestion tables matches the schema of the refined tables. Once you have a common data structure, you have to consider the issue of data conformity. Is the data standardized?

Is a State field a two-letter abbreviation (e.g. NY) or is it spelled out (New York)?

Is Phone Number standardized (e.g. for US number is it (415) 555-1212 or 415-555-1212 or 4155551212)?

These are simple examples; things can get more complicated when it comes to internal data (e.g., sales stages, product codes, customer statuses).
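Standardization rules like these are straightforward to express as small transformation functions. Here is a minimal plain-Python sketch, illustrative only: these are not Talend components, and the `US_STATES` lookup table is a made-up excerpt of a full mapping.

```python
import re

US_STATES = {"new york": "NY", "california": "CA"}  # excerpt of a full lookup table

def standardize_state(value):
    """Normalize a State field to its two-letter abbreviation."""
    v = value.strip()
    if len(v) == 2 and v.isalpha():
        return v.upper()
    return US_STATES.get(v.lower(), v)

def standardize_phone(value):
    """Normalize a US phone number to the (NNN) NNN-NNNN format."""
    digits = re.sub(r"\D", "", value)
    if len(digits) == 10:
        return f"({digits[:3]}) {digits[3:6]}-{digits[6:]}"
    return value  # leave non-conforming values for stewardship review

standardize_state("New York")      # "NY"
standardize_phone("415-555-1212")  # "(415) 555-1212"
```

In Talend Studio the same logic is typically handled declaratively with standardization and data quality components rather than hand-written code.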

Assuming you have a data structure that conforms to a standard, what about duplication — especially for reference data, master data, and dimension data?

Although you can find duplicates easily when field contents are exactly the same (John Smith in CRM and John Smith in your billing system), what happens when data sets differ, yet refer to the same person, organization, place, or thing? For example, John Van Ofen who lives at 123 Hanover St and Johan Vanoven who lives at 123 Hanover Strasse could be the same person.
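Fuzzy matching is the usual answer to this problem: score how similar two records are, and flag pairs above a threshold as likely duplicates. A minimal plain-Python sketch using the standard library's `difflib` follows; it is illustrative only (production matching engines add techniques such as phonetic encoding and survivorship rules), and the 0.8 threshold is an arbitrary choice for this example:

```python
from difflib import SequenceMatcher

def similarity(a, b):
    """Rough string similarity between 0 and 1."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def likely_duplicates(rec1, rec2, threshold=0.8):
    """Flag two records as probable duplicates when their average
    field similarity crosses the threshold."""
    name_sim = similarity(rec1["name"], rec2["name"])
    addr_sim = similarity(rec1["address"], rec2["address"])
    return (name_sim + addr_sim) / 2 >= threshold

a = {"name": "John Van Ofen", "address": "123 Hanover St"}
b = {"name": "Johan Vanoven", "address": "123 Hanover Strasse"}
c = {"name": "Mary Jones", "address": "9 Elm Avenue"}

likely_duplicates(a, b)  # True: probably the same person
likely_duplicates(a, c)  # False: clearly different records
```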

Data must be restructured to a common format. It must meet organizational standards and it needs to be deduplicated.  Talend Data Fabric, with its suite of transformation and data quality components, can standardize the data and automatically correct it, ensuring your lakehouse data is refined enough to be at the “Silver” level.

  See our data quality documentation for a full description of our data quality features, or download the Definitive Guide to Data Quality to learn how to stop bad data before it enters your system.  

Much like a toolbox, Talend Data Fabric allows data engineers to quickly find components that can solve their data transformation or data quality challenges.

 

Gold – feature engineering and aggregate data store

Once you have all the data loaded into a common structure, you should trust the quality of that data. Now the final step is to identify the features: the field, or combination of fields, to use in your data science and machine learning algorithms. At this step you can also create aggregations that simplify model development.

Aggregation

Aggregation is a common function in data analytics.

Talend offers many aggregation components, including count, min, max, sum, average, median, mean, and others, and uses mappings to apply the components to perform these calculations.
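The idea behind those components is simple to sketch. A plain-Python illustration of grouped aggregation follows; this is not Talend's implementation, just the concept such components apply:

```python
from statistics import mean, median

orders = [
    {"region": "East", "amount": 120.0},
    {"region": "East", "amount": 80.0},
    {"region": "West", "amount": 200.0},
]

def aggregate_by(rows, key, value):
    """Group rows by `key` and compute common aggregates over `value`."""
    groups = {}
    for r in rows:
        groups.setdefault(r[key], []).append(r[value])
    return {
        k: {"count": len(v), "sum": sum(v), "min": min(v),
            "max": max(v), "avg": mean(v), "median": median(v)}
        for k, v in groups.items()
    }

stats = aggregate_by(orders, "region", "amount")
# e.g. stats["East"] holds count, sum, min, max, avg, and median for East
```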

Feature engineering

Feature engineering is an integral step in machine learning. Ensuring that your machine learning models have measurable properties or characteristics (called features) not only makes the data compatible with the model, it also improves the model’s accuracy and performance. Feature engineering is the process of transforming, standardizing, and preparing the data for the ML model.

There are many aspects to feature engineering, and although this article will not attempt to describe them all, these techniques are well described in “Fundamental Techniques of Feature Engineering for Machine Learning”.

This article highlights the key Feature Engineering functions needed.  These include:

  • Imputation – filling in values that are not present in source
  • Handling Outliers – finding and disregarding values that are outliers
  • Binning – grouping data into common “bins” (i.e. converting values to “High”, “Medium”, “Low” groupings)
  • Log Transform – standardize numerical data to correct for magnitude differences
  • One-Hot Encoding – turn categorical data into a table of 1’s and 0’s making it easy to consume for a ML model
  • Grouping Operations – Organizing the data into a pivot table
  • Feature Split – Decomposing a value into constituent parts (i.e. full name into first, middle, last)
  • Scaling – normalizing numeric values into a range between 0 and 1 and standardizing the scales by considering standard deviations
  • Extracting Date – identify the day, month, year, time between dates, holidays, weekends, etc.

As I look at each of these tactics in feature engineering, I keep checking them off as transformation functions or data quality functions that are provided in Talend Studio as out-of-the-box components. In fact, Talend offers many more ML prep components; a full description can be found in our online documentation.
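To make the list above concrete, here is a minimal plain-Python sketch of four of these techniques (imputation, binning, one-hot encoding, and min-max scaling). It is illustrative only; in Talend Studio the equivalent logic comes from built-in components rather than hand-written code:

```python
def impute(values):
    """Imputation: replace missing values with the column mean."""
    present = [v for v in values if v is not None]
    fill = sum(present) / len(present)
    return [fill if v is None else v for v in values]

def bin_value(v, low, high):
    """Binning: map a numeric value to a Low/Medium/High group."""
    return "Low" if v < low else ("High" if v >= high else "Medium")

def one_hot(values):
    """One-hot encoding: turn a categorical column into 1s and 0s."""
    cats = sorted(set(values))
    return [[1 if v == c else 0 for c in cats] for v in values]

def min_max_scale(values):
    """Scaling: normalize numeric values into the 0..1 range."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

ages = impute([25, None, 35])                 # [25, 30.0, 35]
bins = [bin_value(a, 30, 60) for a in ages]   # ["Low", "Medium", "Medium"]
colors = one_hot(["red", "blue", "red"])      # [[0, 1], [1, 0], [0, 1]]
scaled = min_max_scale([25, 30, 35])          # [0.0, 0.5, 1.0]
```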

 

Talend supports key Databricks infrastructure

Delta Lake users need the ability to leverage the Databricks platform and the core Spark and Delta Lake technologies.

Talend has been the leader in native Spark code generation dating back to the first commercial releases of Spark. Native code generation ensures that the logic defined in Talend Studio translates to the highest performance execution while ensuring that code follows the standard practices of Databricks.

With the latest release, Talend adds production-level support for reading and writing Delta tables. Because Data Fabric supports Spark DataSets and DataFrames, Talend jobs can attain the highest possible performance with the easiest way to define workloads.

Power to the people

We surveyed hundreds of Talend customers using Databricks and asked them what data sources they want to load into a lakehouse, whether they want to make a one-time copy or they have an ongoing replication need, and what user profile they want to perform this work.

More than 70% of the respondents said they wanted data from cloud-based sales and marketing applications such as Salesforce, Marketo, and Google Ads. They said they want to do an initial bulk load of data, then keep the lakehouse refreshed at least every day; many said they wanted data refreshed at least every 15 minutes. Lastly, while data engineers accounted for a significant share of the responses, even more data scientists and business analysts were interested in performing this work.

 

Summary

With built-in capabilities to ingest data into Bronze tables, refine that data into Silver tables, and finalize that data for data science and ML into your Gold tables, Talend provides the complete breadth of functions to build and maintain data pipelines. With the simple configuration setting that targets Databricks, you can deploy any level of data pipelines, data quality, and data governance easily. With Talend, you have an end-to-end data management platform to support BI analytics, data engineering, data science, and machine learning use cases.

 

 

The post Enabling Olympic-level performance and productivity for Delta Lake on Databricks appeared first on Talend Real-Time Open Source Data Integration Software.


Talend Named a Leader in the Enterprise Data Fabric, Q2 2020 Forrester Wave™


We are happy to announce Talend is a Leader in The Forrester Wave™: Enterprise Data Fabric, Q2 2020.  Talend’s unified approach to data management – combining data integration, integrity, and governance in a single platform – is the best way to gain clarity and confidence in your data.  Since we launched Talend Data Fabric in 2015, we’ve been strong believers that data integration and management could not be solved with a static, siloed enterprise software solution. Companies and data users need a solution that is flexible, can grow with their needs, and is unified enough that the entire organization can use and benefit from it.  But that is a lot, so I want to unpack all this a bit for you today. 

 

Forrester Wave

 

Let’s start out with a deeper dive into what challenges an Enterprise Data Fabric can solve.  According to Forrester, traditional approaches to data integration cannot address “new business requirements that demand a combination of real-time connected data, self-service, and a high degree of automation, speed, and intelligence. While collecting data from various sources is often straightforward, enterprises often struggle to integrate, process, curate, and transform data with other sources to deliver a comprehensive view of the customer, partner, product, and employee.”

 

We could not agree more. If you are building a customer 360 solution or an end-to-end marketing campaign solution, you will not be able to do this effectively with a bunch of siloed tools and stand-alone applications. You need a full Enterprise Data Fabric not only to manage the data, but also to govern the full platform, including the metadata and processes. We have seen many customers embrace this approach, including AstraZeneca and Euronext. In a heavily regulated environment, AstraZeneca devised a strategic initiative of returning to growth galvanized around data consolidation and transformation. Using Talend Data Fabric, they were able to facilitate point-to-point connections with APIs, harness their metadata with a Data Catalog, trust their data with Data Quality, and automate their data pipelines – hence dramatically lowering costs and risk.

 

We believe Talend has been able to sustain its place as a market leader by focusing on where data integration and data management is headed rather than where it’s been. We are creating an easy system for everyone to verify that the data they are betting their business on is data that they can trust. We are creating a connection between the analytics running the business and the systems, checks, and balances ensuring the quality and compliance of the data powering them.

Our Data Trust Score delivers an instant assessment of your data health and accuracy, based on data quality, data popularity, and user-defined ratings. It allows you to assess the relevance and trustworthiness of your data at first sight.

A laser focus on cloud – deployment of Talend in the cloud, as well as support for hybrid and multi-cloud environments – has been another key area of emphasis for Talend for the last several years. That’s why we have continued to build Talend Data Fabric – our unified platform for data integration, governance, and sharing – on-premises and in the cloud. And we have progressively added more and more machine learning-based capabilities to automate data quality tasks, make data pipelines more intelligent, and enable more non-technical users to gain self-service access to data.

It is this focus on continuous improvement and a willingness to change – and change quickly – that has, in our opinion, helped Talend become a firmly established Enterprise Data Fabric leader in a short period of time. Talend received the highest score of any vendor in the report in the Current Offering category, and earned the highest possible scores in criteria like data quality, data lineage, and data catalog, as well as for our roadmap.  We believe these high scores continue to validate that Talend’s vision and mission are what you as an enterprise need to be successful in your pursuit of an Enterprise Data Fabric.  You can read the full Forrester report here.

 

The post Talend Named a Leader in the Enterprise Data Fabric, Q2 2020 Forrester Wave™ appeared first on Talend Real-Time Open Source Data Integration Software.

Our reflections on the 2020 Gartner Magic Quadrant for Data Quality Solutions


 “Every organization — no matter how big or how small — needs data quality,” says Gartner in its newly published Magic Quadrant for Data Quality Solutions. However, with more and more data coming from more and more sources, it’s increasingly harder for data professionals to transform the growing data chaos into trusted and valuable data assets. Data pipelines may carry incomplete and inaccurate data, making data practitioners’ jobs difficult and preventing data-driven initiatives from delivering on their expected business outcomes.

We all aspire to transform our organizations with data-driven insights, but we can’t do that if we don’t trust our data. A recent Opinion Matters survey shows that only 31% of data specialists have a high level of confidence in their organizations’ ability to deliver trusted data at speed.

Working without reliable data becomes costly, risky, and chaotic. Whether you’re unifying product and customer data in a single 360° view to transform the customer experience, or you need to comply with data privacy regulations, data quality can make the difference between the success and failure of your data-driven initiatives.

 

No data management initiative is complete without a solid data quality strategy

Data quality can dramatically impact your bottom line. Gartner stated that its “Magic Quadrant customer survey shows that organizations estimate the average cost of poor data quality at $12.9 million every year.” Another Gartner report also positions data governance and data quality as the most important initiatives for data management strategies.

As data quality is becoming a linchpin of data management, we’re proud that Talend was recognized by Gartner as a Leader for the third time in a row in the 2020 edition of Gartner’s Magic Quadrant for Data Quality Solutions. 

We believe data quality shouldn’t be managed by a standalone solution. Rather, data quality is a core discipline within data management. It should extend everywhere, and this requires integration and extensibility.

Talend Data Fabric delivers data quality as a pervasive capability that spans across our platform and related applications, and that includes self-service data preparation, data integration, real-time integration, metadata management, and a data catalog.

We believe being recognized as a Leader in the Magic Quadrant for Data Quality Solutions not only validates our capacity to build a vision for data quality, but also our ability to help organizations succeed in their digital transformation journeys.

gartner magic quadrant 2020 data quality

Download a complimentary copy of the 2020 Magic Quadrant for Data Quality Solutions.

 

4 innovations that make the biggest impact on data quality

The research also considers the technologies and innovations in the data quality market. Let’s review those key ingredients and see how Talend addresses them.

Ubiquity: horizontal, not vertical data quality

Talend has made data quality a key component of its data management vision for a decade; we have been positioned in this Gartner Magic Quadrant since 2011. Talend has always considered data quality the key to making any data management project a success.

We embed data quality into every step of the data pipeline by making Talend Data Quality an integrated part of Talend Data Fabric instead of a standalone application, so that customers can get data they trust at every stage of the data lifecycle.

 

Simplicity: democratizing data quality with simple, efficient, collaborative data systems

Data practitioners need simple, intelligent, automated data quality tools to transform data chaos into valuable, reusable data assets.

Talend was among the first contenders to cover that need. Talend introduced self-service data preparation tools in 2016, bridging the gap between IT capabilities and business needs. The following year, Talend entered the Magic Quadrant for Data Quality Solutions as a Leader. Today, Talend Data Fabric provides a unified, collaborative platform in the cloud on which nontechnical users can profile, contribute, and improve data, removing the hassle of legacy on-premises systems.

 

Automation: data quality made intelligent

Amplifying data quality with machine learning has become a key differentiator. “By 2022,” Gartner predicts, “60% of organizations will leverage machine-learning-enabled data quality technology for suggestions to reduce manual tasks for data quality improvement.” Business users need help to accelerate preparation for better data.

To that end, Talend recently introduced more machine learning-driven features, such as Magic Fill to accelerate data preparation and let users process data quicker and better.

 

Collaboration: bring the people expertise back into the data

Still, while automation is important, it’s not the answer to everything. Data quality success often stems from the right alliance of people, technology, and processes aligning with each other to make an impact. People must remain in control, and human expertise must be captured and employed in the data chain. To capture that knowledge, another component of Talend Data Fabric, Talend Data Stewardship, helps organizations assign data validation to appointed experts across the organization and track and audit progress.

Talend’s stewardship capabilities were highlighted by our customers in the previous Magic Quadrant, and continue to provide value to customers. That’s why we made Talend Data Stewardship a key part of our Talend Data Fabric, letting organizations not only offer that functionality, but also engage users in a virtuous circle with their data.

 

Companies rely on data quality to deliver successful data strategies

We’re witnessing these innovations and new needs firsthand and are proud to support our customers on their journey to data quality.

Talend customer: Seacoast Bank

Take the example of Seacoast Bank, which created a data quality index for all their financial services. Seacoast Bank relies on data to be able to provide customers the best solutions for their needs, and to develop a deeper understanding of who their customers are and how they want to work with the bank. And being heavily regulated, Seacoast Bank also understands the need for trusted data. Seacoast Bank is banking on a data quality index to measure data quality across six dimensions and track how it improves or degrades as the bank acquires other banks, and as data sources, processes, and the technical environment change.

It’s our duty to make sure each customer’s data accurately reflects who they are in our community, and what their relationship is with our community-based bank.

Mark Blanchette
SVP, Director of Data Management and Business Technology, Seacoast Bank

 

 

Talend is working with a renowned telco operator that serves more than 90 million mobile subscribers. Our customer was facing huge data quality challenges that led to underperforming customer communications. They used Talend Data Quality to convert bad data into a steady stream of clean and reliable source data to power advanced analytics. This happens automatically every day, allowing data analysts, the operations team, and even business users to know whether the data they are using is accurate and valid. The results were impressive: the company went from a 40% trust score to over 90%, with better efficiency, cost reduction, risk protection, and higher ROI on marketing campaigns.

 

Everyone should know what’s inside their data, score it, and improve it over time

Gartner predicts that “by 2022, 70% of organizations will rigorously track data quality levels via metrics, increasing data quality by 60% to significantly reduce operational risks and costs.”

Talend brought data profiling into the hands of data engineers. Now that everyone wants to use data, it’s equally important to let data workers understand the data, endorse it, score it, and improve it.

Data Trust Score by Talend

Talend Trust Score does just that. The Trust Score helps anyone answer at a glance the question “How trustworthy is my dataset?” It’s based not only on data quality indicators, but also on popularity and certification, so that reliable and authoritative datasets can be shared and populated across the organization.

 

We’re still in the early stages of the data quality journey. Data management practices are constantly evolving, and we’re seeing capabilities converging into a unified platform that can meet the needs of both business departments and IT.

We’re happy to help. We thank all the customers who have placed their trust in Talend. And to anyone who wants to bring clarity to their data chaos, we invite you to discover Talend, try our data quality stack, and become part of our growing user community.

 

 

Gartner, Magic Quadrant for Data Quality Solutions, Melody Chien, Ankush Jain, 27 July 2020
Gartner, Survey Analysis: Data Management Struggles to Balance Innovation and Control, Melody Chien, Nick Heudecker, 19 March 2020
Gartner, Build a Data Quality Operating Model to Drive Data Quality Assurance, Melody Chien, Saul Judah, Ankush Jain, 29 January 2020 

Gartner does not endorse any vendor, product or service depicted in its research publications, and does not advise technology users to select only those vendors with the highest ratings or other designation. Gartner research publications consist of the opinions of Gartner’s research organization and should not be construed as statements of fact. Gartner disclaims all warranties, express or implied, with respect to this research, including any warranties of merchantability or fitness for a particular purpose.
GARTNER is a registered trademark and service mark of Gartner, Inc. and/or its affiliates in the U.S. and internationally, and is used herein with permission. All rights reserved.
This graphic was published by Gartner, Inc. as part of a larger research document and should be evaluated in the context of the entire document. The Gartner document is available upon request from Talend.

The post Our reflections on the 2020 Gartner Magic Quadrant for Data Quality Solutions appeared first on Talend Real-Time Open Source Data Integration Software.

Like the Infinity Stones, keep your Talend services as far apart as possible


Ok, so we all probably know why keeping all Infinity Stones in one place is a bad idea, right? You must now be wondering what the relationship between Infinity Stones and Talend could be. Worry not, Thanos isn’t coming and there is a reasonable explanation behind the MCU fandom references, I promise.

I like to use this analogy to emphasize how distributing your Talend services among different servers according to Talend recommended architecture guidelines is as important as keeping the Infinity stones scattered across the universe and away from the clutches of evil. Are you following me?

 

In this blog, we will explore some common customer questions which come up during Talend architecture discussions and our answers to those queries.

The most common query during the planning stage

One of the most common queries I used to get while interacting with customers:

What should be the right methodology to allocate various Talend services in our ecosystem? Shall I give you one gigantic server which can handle all our computational requirements?

After going through this blog, I am sure you will be able to answer this query yourself. Before getting into the details, we need to agree that cloud technology has given customers the power to select any type of server quickly and have it up and running in a matter of minutes. This means that, theoretically, you could put all the Talend services on a single server for demonstration purposes. But is that the right approach? Before making the final call, let’s discuss the various factors surrounding this scenario.

 

Going back to Monolithic server approach

The world had already moved away from monolithic systems when I started my IT career (which doesn’t mean I am too old or too young!). One of the tendencies we are seeing is that some users are trying to go back to the same monolithic environment patterns by taking a different route. This time, the monolithic server concept is wrapped carefully in gift paper, in the form of a single high-computation server available in various cloud environments. Customers often overlook the fact that those servers are available in the cloud for specific high-computation use cases like graphics processing, scientific modeling, dedicated gaming servers, and machine learning.

Talend services

 

From the Talend perspective, it’s always ideal to distribute various services (remember the Infinity Stones) to various servers as per recommended architecture.

 

Keeping all eggs in same basket problem

For argument’s sake, let’s put all the Talend services for a production environment on a single server. When a server failure occurs, this approach will bring down the entire Talend ecosystem.

 

This might be a costly affair, especially if the enterprise has specific data transfer SLAs between various systems. But if you distribute the Talend services according to the recommended architecture, you can manage these types of failures gracefully.

 

The battle for system resources

Still not convinced? Then I will take you to the next phase, where the battle to grab system resources happens. The basic scenario remains the same: you have installed all the services on a single server.

 

Imagine you are going to pump a lot of data from source to target in batch format. We must also consider that multiple transformation rules need to be applied to the incoming data. This means your job server will be ready to grab a lot of the available “treasure” (which is nothing more than system resources like CPU, memory, and disk space). At the same time, you need to make sure that system resources remain available for other services like the TAC, continuous integration systems, and runtime servers.

The tug-of-war for system resources will eventually lead to a big battle among the various services. For example, let’s assume the TAC fails due to lack of memory. This means you have lost all control over the ecosystem and there is nothing left to manage the services. If the victim is the runtime server, your data flows through various web services and routes will start failing.

At the same time, if you use a gigantic single server, you may not use its entire computational capacity all the time. It would be like gold-plating all of your weapons: the result is too much cost to maintain the underlying infrastructure.

 

Refreshing our minds with Talend Architecture

I hope, by now, all of you are convinced by the rationale of not keeping all the Infinity Stones Talend services on a single server. Before going further into the recommended architecture, let us quickly refresh our minds about the various services involved in the Talend architecture. I will start with the on-premises version. The diagram below will help you understand the various services involved in handling both batch and streaming data. If you would like to understand more about each service, you can find the details at this link.

Talend Architecture

 

The Talend Cloud architecture simplifies the overall landscape, but remember that you may still have to manage Remote Engines (for Studio jobs or Pipeline Designer), Runtime Servers, Continuous Integration activities, and so on.

Talend Architecture

 

If you would like to know more about the Talend Cloud architecture, I recommend having a look at the Cloud Data Integration User Guide.

 

Talend Recommended Architecture

A detailed description of the Talend recommended architecture (including server sizing) for on-premises products is available at this link. I will not repeat that content here, but here is a high-level view of the recommended server layout for quick reference.

Talend Recommended Architecture

 

The recommended cloud architecture layout is much simpler, since the Talend services themselves are managed in the cloud. You can find the recommended Talend Cloud architecture at this link; a quick peek at the server layout for Talend Cloud is shown below.

Talend Cloud Architecture

 

I hope this discussion of the “Infinity Stones” of Talend was as interesting for you as it was for me. Until I come up with another clever analogy to write a blog around, enjoy your time using Talend and keep those “stones” safe!

 

The post Like the Infinity Stones, keep your Talend services as far apart as possible appeared first on Talend Real-Time Open Source Data Integration Software.

Our reflections on the 2020 Gartner Magic Quadrant for Data Quality Solutions


 “Every organization — no matter how big or how small — needs data quality,” says Gartner in its newly published Magic Quadrant for Data Quality Solutions. However, with more and more data coming from more and more sources, it’s increasingly harder for data professionals to transform the growing data chaos into trusted and valuable data assets. Data pipelines may carry incomplete and inaccurate data, making data practitioners’ jobs difficult and preventing data-driven initiatives from delivering on their expected business outcomes.

We all aspire to transform our organizations with data-driven insights, but we can’t do that if we don’t trust our data. A recent Opinion Matters survey shows that only 31% of data specialists have a high level of confidence in their organizations’ ability to deliver trusted data at speed.

Working without reliable data becomes costly, risky, and chaotic. Whether you’re unifying product and customer data in a single 360° view to transform the customer experience, or you need to comply with data privacy regulations, data quality can make the difference between the success and failure of your data-driven initiatives.

 

No data management initiative is complete without a solid data quality strategy

Data quality can dramatically impact your bottom line. Gartner stated that its “Magic Quadrant customer survey shows that organizations estimate the average cost of poor data quality at $12.9 million every year.” Another Gartner report also positions data governance and data quality as the most important initiatives for data management strategies.

As data quality is becoming a linchpin of data management, we’re proud that Talend was recognized by Gartner as a Leader for the third time in a row in the 2020 edition of Gartner’s Magic Quadrant for Data Quality Solutions. 

We believe data quality shouldn’t be managed by a standalone solution. Rather, data quality is a core discipline within data management: it should extend everywhere, and that requires integration and extensibility.

Talend Data Fabric delivers data quality as a pervasive capability that spans across our platform and related applications, and that includes self-service data preparation, data integration, real-time integration, metadata management, and a data catalog.

We believe that being recognized as a Leader in the Magic Quadrant for Data Quality Solutions validates not only our vision for data quality, but also our ability to help organizations succeed in their digital transformation journeys.

2020 Gartner Magic Quadrant for Data Quality Solutions

Download a complimentary copy of the 2020 Magic Quadrant for Data Quality Solutions.

 

4 innovations that make the biggest impact on data quality

The research also considers the technologies and innovations in the data quality market. Let’s review those key ingredients and see how Talend addresses them.

Ubiquity: horizontal, not vertical data quality

Talend has made data quality a key component of its data management vision for a decade; we have been positioned in this Gartner Magic Quadrant since 2011. Talend has always considered data quality the key to making any data management project a success.

We embed data quality into every step of the data pipeline by making Talend Data Quality an integrated part of Talend Data Fabric instead of a standalone application, so that customers can get data they trust at every stage of the data lifecycle.
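The idea of quality at every stage of the pipeline can be pictured as a validation gate wired between extract and load, so bad rows are quarantined instead of silently propagated. Talend provides dedicated components for this; the hand-rolled sketch below, with made-up rules, only illustrates the pattern.

```python
# Hand-rolled illustration of a quality gate in a pipeline: rows that
# fail any rule are quarantined instead of being loaded downstream.
def validate(rows, rules):
    good, rejects = [], []
    for row in rows:
        (good if all(rule(row) for rule in rules) else rejects).append(row)
    return good, rejects

rules = [
    lambda r: r.get("email", "").count("@") == 1,  # plausible email
    lambda r: r.get("amount", 0) >= 0,             # no negative amounts
]
rows = [{"email": "a@b.com", "amount": 10},
        {"email": "broken", "amount": -5}]
good, rejects = validate(rows, rules)
print(len(good), len(rejects))  # 1 1
```

The rejects can then be routed to stewardship or remediation rather than corrupting downstream analytics.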

 

Simplicity: democratizing data quality with simple, efficient, collaborative data systems

Data practitioners need simple, intelligent, automated data quality tools to transform data chaos into valuable, reusable data assets.

Talend was among the first contenders to cover that need. Talend introduced self-service data preparation tools in 2016, bridging the gap between IT capabilities and business needs. The following year, Talend entered the Magic Quadrant for Data Quality Solutions as a Leader. Today, Talend Data Fabric provides a unified, collaborative platform in the cloud on which nontechnical users can profile, contribute, and improve data, removing the hassle of legacy on-premises systems.

 

Automation: data quality made intelligent

Amplifying data quality with machine learning has become a key differentiator. “By 2022,” Gartner predicts, “60% of organizations will leverage machine-learning-enabled data quality technology for suggestions to reduce manual tasks for data quality improvement.” Business users need help to accelerate preparation for better data.

To that end, Talend recently introduced more machine learning-driven features, such as Magic Fill to accelerate data preparation and let users process data quicker and better.
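Features in this family are typically example-driven: the user supplies a couple of before/after pairs and the tool infers a transform to apply to the rest of the column. The real Magic Fill is far more sophisticated, but a toy version conveys the flavor; the candidate-transform menu below is invented for illustration.

```python
# Toy version of example-driven filling: pick, from a fixed menu of
# candidate transforms, the one consistent with the user's examples,
# then apply it to the remaining values.
CANDIDATES = {
    "upper": str.upper,
    "lower": str.lower,
    "title": str.title,
    "strip": str.strip,
}

def learn_transform(examples):
    """examples: list of (raw, desired) pairs; returns (name, fn) or (None, None)."""
    for name, fn in CANDIDATES.items():
        if all(fn(raw) == want for raw, want in examples):
            return name, fn
    return None, None

name, fn = learn_transform([("john SMITH", "John Smith")])
print(name, fn("jane DOE"))  # title Jane Doe
```

One example pair is enough here to disambiguate the transform; real tools search a much richer space of string programs.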

 

Collaboration: bringing human expertise back into the data

Still, while automation is important, it’s not the answer to everything. Data quality success often stems from the right alliance of people, technology, and processes aligning with each other to make an impact. People must remain in control, and human expertise must be captured and employed in the data chain. To capture that knowledge, another component of Talend Data Fabric, Talend Data Stewardship, helps organizations assign data validation to appointed experts across the organization and track and audit progress.

Talend’s stewardship capabilities were highlighted by our customers in the previous Magic Quadrant, and continue to provide value to customers. That’s why we made Talend Data Stewardship a key part of our Talend Data Fabric, letting organizations not only offer that functionality, but also engage users in a virtuous circle with their data.

 

Companies rely on data quality to deliver successful data strategies

We’re witnessing these innovations and new needs firsthand and are proud to support our customers on their journey to data quality.

Talend customer: Seacoast Bank

Take the example of Seacoast Bank, which created a data quality index for all their financial services. Seacoast Bank relies on data to provide customers the best solutions for their needs and to develop a deeper understanding of who their customers are and how they want to work with the bank. Being heavily regulated, Seacoast Bank also understands the need for trusted data. Seacoast Bank is banking on a data quality index to measure data quality across six dimensions and to track how it improves or degrades as the bank acquires other banks, and as data sources, processes, and the technical environment change.

It’s our duty to make sure each customer’s data accurately reflects who they are in our community, and what their relationship is with our community-based bank.

Mark Blanchette
SVP, Director of Data Management and Business Technology, Seacoast Bank
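As an illustration of the index idea (this is not Seacoast’s actual model), a data quality index can be as simple as a weighted average of per-dimension scores; the six dimension names and weights below are assumptions for the sketch.

```python
# Illustrative only -- not Seacoast's actual model. A quality index as a
# weighted average of per-dimension scores, reported on a 0-100 scale.
DIMENSIONS = ["completeness", "accuracy", "consistency",
              "timeliness", "validity", "uniqueness"]

def quality_index(scores, weights=None):
    """scores: dict mapping each dimension to a 0..1 score."""
    weights = weights or {d: 1.0 for d in DIMENSIONS}
    total_w = sum(weights[d] for d in DIMENSIONS)
    return 100 * sum(scores[d] * weights[d] for d in DIMENSIONS) / total_w

scores = {"completeness": 0.98, "accuracy": 0.95, "consistency": 0.90,
          "timeliness": 0.85, "validity": 0.97, "uniqueness": 0.99}
print(round(quality_index(scores), 1))  # 94.0
```

Tracking such an index over time is what lets a team see quality improve or degrade as sources and processes change.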

 

 

Talend is working with a renowned telco operator that serves more than 90 million mobile subscribers. Our customer was facing huge data quality challenges that led to underperforming customer communications. They used Talend Data Quality to convert bad data into a steady stream of clean and reliable source data to power advanced analytics. This happens automatically every day, allowing data analysts, the operations team, and even business users to know if the data they are using is accurate and valid. Results were impressive: The company went from a 40% to a 90%+ trust score that saw better efficiency, cost reduction, risk protection, and higher ROI of marketing campaigns.

 

Everyone should know what’s inside their data, score it, and improve it over time

Gartner predicts that “by 2022, 70% of organizations will rigorously track data quality levels via metrics, increasing data quality by 60% to significantly reduce operational risks and costs.”

Talend brought data profiling into the hands of data engineers. Now that everyone wants to use data, it’s equally important to let data workers understand the data, endorse it, score it, and improve it.

Data Trust Score by Talend

Talend Trust Score does just that. The Trust Score helps anyone answer at a glance the question “How trustworthy is my dataset?” It’s based not only on data quality indicators, but also on popularity and certification, so that reliable and authoritative datasets can be shared and promoted across the organization.
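The actual Trust Score formula is Talend’s own, but the idea of blending intrinsic quality with usage signals can be sketched as follows; the weights and the 0–5 scale are assumptions for illustration.

```python
# Sketch of the idea only -- not the actual Trust Score formula.
# Blend intrinsic data quality with usage signals (popularity,
# certification) into a single 0-5 score; weights are assumptions.
def trust_score(quality, popularity, certified,
                w_quality=0.6, w_popularity=0.3, w_cert=0.1):
    """quality and popularity in 0..1; certified is a boolean."""
    raw = (w_quality * quality
           + w_popularity * popularity
           + w_cert * (1.0 if certified else 0.0))
    return round(5 * raw, 1)

print(trust_score(quality=0.9, popularity=0.8, certified=True))
```

Folding popularity and certification in alongside quality is what lets a widely used, officially endorsed dataset outrank an equally clean but unvetted one.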

 

We’re still in the early stages of the data quality journey. Data management practices are constantly evolving, and we’re seeing capabilities converging into a unified platform that can meet the needs of both business departments and IT.

We’re happy to help. We thank all the customers who have placed their trust in Talend. And to anyone who wants to bring clarity to their data chaos, we invite you to discover Talend, try our data quality stack, and become part of our growing user community.

 

 

Gartner, Magic Quadrant for Data Quality Solutions, Melody Chien, Ankush Jain, 27 July 2020
Gartner, Survey Analysis: Data Management Struggles to Balance Innovation and Control, Melody Chien, Nick Heudecker, 19 March 2020
Gartner, Build a Data Quality Operating Model to Drive Data Quality Assurance, Melody Chien, Saul Judah, Ankush Jain, 29 January 2020 

Gartner does not endorse any vendor, product or service depicted in its research publications, and does not advise technology users to select only those vendors with the highest ratings or other designation. Gartner research publications consist of the opinions of Gartner’s research organization and should not be construed as statements of fact. Gartner disclaims all warranties, express or implied, with respect to this research, including any warranties of merchantability or fitness for a particular purpose.
GARTNER is a registered trademark and service mark of Gartner, Inc. and/or its affiliates in the U.S. and internationally, and is used herein with permission. All rights reserved.
This graphic was published by Gartner, Inc. as part of a larger research document and should be evaluated in the context of the entire document. The Gartner document is available upon request from Talend.

The post Our reflections on the 2020 Gartner Magic Quadrant for Data Quality Solutions appeared first on Talend Real-Time Open Source Data Integration Software.




Like the Infinity Stones, keep your Talend services as far apart as possible

$
0
0

Ok, so we all probably know why keeping all Infinity Stones in one place is a bad idea, right? You must now be wondering what the relationship between Infinity Stones and Talend could be. Worry not, Thanos isn’t coming and there is a reasonable explanation behind the MCU fandom references, I promise.

I like to use this analogy to emphasize how distributing your Talend services among different servers according to Talend recommended architecture guidelines is as important as keeping the Infinity stones scattered across the universe and away from the clutches of evil. Are you following me?

 

In this blog, we will explore some common customer questions which come up during Talend architecture discussions and our answers to those queries.

Most Common query during planning stage

One of the most common queries I used to get while interacting with customers:

What should be the right methodology to allocate various Talend services in our ecosystem? Shall I give you one gigantic server which can handle all our computational requirements?

After going through the blog, I am sure that you will find the answer to this query yourself. Before going to the details, we need to agree that the Cloud technology has given power to the customers to select any type of servers quickly and they can set the server up and running in a matter of minutes. This means, theoretically, you can put all the Talend services in single server for demonstrational purposes. But is it a right approach? Before making the final call, we will discuss about various factors surrounding this scenario.

 

Going back to Monolithic server approach

The world had already moved out of monolithic systems when I started my IT career (well, it doesn’t mean I am too old or too young ?). One of the tendencies we are seeing is that some users are trying to go back to same monolithic environment patterns by taking a different route. This time, the monolithic server concept is wrapped carefully with gift paper in the form of a single high computational server available in various Cloud environments. Customers often overlook the fact that those servers are available in Cloud for specific high computational use cases like graphics processing, scientific modeling, dedicated gaming servers, machine learning etc.

Talend services

 

From the Talend perspective, it’s always ideal to distribute various services (remember the Infinity Stones) to various servers as per recommended architecture.

 

Keeping all eggs in same basket problem

For argument’s sake, let’s put all the services of Talend for a Production environment in a single server. When a server failure occurs, this approach will bring down the entire Talend ecosystem.

 

This might be a costly affair especially if the enterprise is having specific data transfer SLAs between various systems. But if you are distributing the Talend environments in recommended architecture style, you can manage these types of failures in a graceful manner.

 

The battle for system resources

Still not convinced? Then let me take you to the next phase, where the battle to grab system resources takes place. The basic scenario remains the same: you have installed all the services on a single server.

 

Imagine you are pumping a large volume of data from source to target in batch mode, and that multiple transformation rules must be applied to the incoming data. Your job server will be ready to grab a lot of the available "treasure" (which is nothing more than system resources like CPU, memory, and disk space). At the same time, you need to make sure system resources remain available for the other services, such as TAC, Continuous Integration systems, and Runtime servers.

The tug-of-war for system resources will eventually lead to a big battle among the various services. For example, suppose TAC fails due to lack of memory. You have now lost all control over the ecosystem, and there is nothing left to manage the services. If the victim is the Runtime server instead, your data flows through the various web services and routes will start failing.

At the same time, if you use a single gigantic server, you may not use its full computational capacity all of the time. It is like gold-plating all of your weapons: the result is an excessive cost to maintain the underlying infrastructure.
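To see why co-locating everything backfires, here is a toy capacity check in Python. The service names and memory figures are hypothetical examples, not Talend sizing guidance:

```python
# Illustrative capacity check: can a set of co-located Talend services share
# one host without starving each other? Service names and memory figures are
# hypothetical, not official Talend sizing guidance.

def placement_ok(services_gb, host_capacity_gb, headroom=0.2):
    """True if peak service demand fits within the host, keeping headroom for the OS."""
    usable = host_capacity_gb * (1 - headroom)
    return sum(services_gb.values()) <= usable

single_host = {"TAC": 4, "JobServer": 16, "Runtime": 8, "CI": 6}

# Everything on one 32 GB server: peak demand (34 GB) exceeds usable memory,
# so under load something (often TAC) gets starved.
print(placement_ok(single_host, 32))                                            # False

# Distributed layout: each service on its own right-sized host.
print(all(placement_ok({svc: gb}, gb * 2) for svc, gb in single_host.items()))  # True
```

In the distributed layout each host is sized for its one service, so a memory-hungry batch run on the job server cannot take TAC down with it.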

 

Refreshing our minds with Talend Architecture

I hope by now you are all convinced of the rationale for not keeping all the Infinity Stones Talend services on a single server. Before going further into the recommended architecture, let us quickly refresh our memory of the various services involved in the Talend architecture, starting with the on-premises version. The diagram below shows the services involved in handling both batch and streaming data. If you would like to understand more about each service, you can find the details at this link.

Talend Architecture

 

The Talend Cloud architecture simplifies the overall landscape. Bear in mind that you may still have to manage Remote Engines (either for Studio jobs or Pipeline Designer), Runtime servers, and Continuous Integration activities.

Talend Architecture

 

If you would like to know more about the Talend Cloud architecture, I recommend having a look at the Cloud Data Integration User Guide.

 

Talend Recommended Architecture

A detailed description of the Talend recommended architecture (including server sizing) for on-premises products is available at this link. I will not repeat that content here, but here is a high-level view of the recommended server layout for quick reference.

Talend Recommended Architecture

 

The recommended cloud architecture layout is much simpler, since the Talend services are managed from the cloud environment. You can refer to the recommended Talend Cloud architecture at this link. A quick peek at the Talend Cloud server layout is shown below.

Talend Cloud Architecture

 

I hope this discussion of the "Infinity Stones" of Talend was as interesting for you as it was for me 🙂. Until I come up with another clever analogy to write a blog around, enjoy your time using Talend and keep those "stones" safe!

 

The post Like the Infinity Stones, keep your Talend services as far apart as possible appeared first on Talend Real-Time Open Source Data Integration Software.


Our reflections on the 2020 Gartner Magic Quadrant for Data Quality Solutions

 “Every organization — no matter how big or how small — needs data quality,” says Gartner in its newly published Magic Quadrant for Data Quality Solutions. However, with more and more data coming from more and more sources, it’s increasingly harder for data professionals to transform the growing data chaos into trusted and valuable data assets. Data pipelines may carry incomplete and inaccurate data, making data practitioners’ jobs difficult and preventing data-driven initiatives from delivering on their expected business outcomes.

We all aspire to transform our organizations with data-driven insights, but we can’t do that if we don’t trust our data. A recent Opinion Matters survey shows that only 31% of data specialists have a high level of confidence in their organizations’ ability to deliver trusted data at speed.

Working without reliable data becomes costly, risky, and chaotic. Whether you’re unifying product and customer data in a single 360° view to transform the customer experience, or you need to comply with data privacy regulations, data quality can make the difference between the success and failure of your data-driven initiatives.

 

No data management initiative is complete without a solid data quality strategy

Data quality can dramatically impact your bottom line. Gartner stated that its “Magic Quadrant customer survey shows that organizations estimate the average cost of poor data quality at $12.9 million every year.” Another Gartner report also positions data governance and data quality as the most important initiatives for data management strategies.

As data quality is becoming a linchpin of data management, we’re proud that Talend was recognized by Gartner as a Leader for the third time in a row in the 2020 edition of Gartner’s Magic Quadrant for Data Quality Solutions. 

We believe data quality shouldn't be managed by a standalone solution. Rather, data quality is a core discipline within data management. It should extend everywhere, which requires integration and extensibility.

Talend Data Fabric delivers data quality as a pervasive capability that spans across our platform and related applications, and that includes self-service data preparation, data integration, real-time integration, metadata management, and a data catalog.

We believe that being recognized as a Leader in the Magic Quadrant for Data Quality Solutions not only validates our capacity to build a vision for data quality, but also our ability to help organizations succeed in their digital transformation journeys.

2020 Gartner Magic Quadrant for Data Quality Solutions

Download a complimentary copy of the 2020 Magic Quadrant for Data Quality Solutions.

 

4 innovations that make the biggest impact on data quality

The research also considers the technologies and innovations in the data quality market. Let’s review those key ingredients and see how Talend addresses them.

Ubiquity: horizontal, not vertical data quality

Talend has made data quality a key component of its data management vision for a decade; we have been positioned in this Gartner Magic Quadrant since 2011. Talend has always considered data quality the key to making any data management project a success.

We embed data quality into every step of the data pipeline by making Talend Data Quality an integrated part of Talend Data Fabric instead of a standalone application, so that customers can get data they trust at every stage of the data lifecycle.

 

Simplicity: democratizing data quality with simple, efficient, collaborative data systems

Data practitioners need simple, intelligent, automated data quality tools to transform data chaos into valuable, reusable data assets.

Talend was among the first contenders to cover that need. Talend introduced self-service data preparation tools in 2016, bridging the gap between IT capabilities and business needs. The following year, Talend entered the Magic Quadrant for Data Quality Solutions as a Leader. Today, Talend Data Fabric provides a unified, collaborative platform in the cloud on which nontechnical users can profile, contribute, and improve data, removing the hassle of legacy on-premises systems.

 

Automation: data quality made intelligent

Amplifying data quality with machine learning has become a key differentiator. “By 2022,” Gartner predicts, “60% of organizations will leverage machine-learning-enabled data quality technology for suggestions to reduce manual tasks for data quality improvement.” Business users need help to accelerate preparation for better data.

To that end, Talend recently introduced more machine learning-driven features, such as Magic Fill to accelerate data preparation and let users process data quicker and better.

 

Collaboration: bring the people expertise back into the data

Still, while automation is important, it’s not the answer to everything. Data quality success often stems from the right alliance of people, technology, and processes aligning with each other to make an impact. People must remain in control, and human expertise must be captured and employed in the data chain. To capture that knowledge, another component of Talend Data Fabric, Talend Data Stewardship, helps organizations assign data validation to appointed experts across the organization and track and audit progress.

Talend’s stewardship capabilities were highlighted by our customers in the previous Magic Quadrant, and continue to provide value to customers. That’s why we made Talend Data Stewardship a key part of our Talend Data Fabric, letting organizations not only offer that functionality, but also engage users in a virtuous circle with their data.

 

Companies rely on data quality to deliver successful data strategies

We’re witnessing these innovations and new needs firsthand and are proud to support our customers on their journey to data quality.

Talend customer: Seacoast Bank

Take the example of Seacoast Bank, which created a data quality index for all their financial services. Seacoast Bank relies on data to be able to provide customers the best solutions for their needs, and to develop a deeper understanding of who their customers are and how they want to work with the bank. And being heavily regulated, Seacoast Bank also understands the need for trusted data. Seacoast Bank is banking on a data quality index to measure data quality across six dimensions and track how it improves or degrades as the bank acquires other banks, and as data sources, processes, and the technical environment change.

It’s our duty to make sure each customer’s data accurately reflects who they are in our community, and what their relationship is with our community-based bank.

Mark Blanchette
SVP, Director of Data Management and Business Technology, Seacoast Bank
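As an illustration of how a data quality index like this can work, the sketch below computes a weighted average across six quality dimensions. The dimension names, weights, and scores are hypothetical placeholders, not Seacoast Bank's actual model:

```python
# Hypothetical composite data quality index: a weighted average of per-dimension
# scores (0-100) across six common quality dimensions. Dimension names, weights,
# and scores are illustrative only.

DIMENSIONS = {
    "completeness": 0.25,
    "validity":     0.20,
    "accuracy":     0.20,
    "consistency":  0.15,
    "timeliness":   0.10,
    "uniqueness":   0.10,
}

def quality_index(scores):
    """Weighted average of per-dimension scores; a missing dimension counts as 0."""
    return sum(DIMENSIONS[d] * scores.get(d, 0) for d in DIMENSIONS)

customer_table = {
    "completeness": 92, "validity": 88, "accuracy": 75,
    "consistency": 90, "timeliness": 60, "uniqueness": 98,
}
print(round(quality_index(customer_table), 1))  # 84.9
```

Recomputing such an index after each acquisition or source change gives a single trend line showing whether overall data quality is improving or degrading.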

 

 

Talend is working with a renowned telco operator that serves more than 90 million mobile subscribers. Our customer was facing huge data quality challenges that led to underperforming customer communications. They used Talend Data Quality to convert bad data into a steady stream of clean and reliable source data to power advanced analytics. This happens automatically every day, allowing data analysts, the operations team, and even business users to know if the data they are using is accurate and valid. Results were impressive: The company went from a 40% to a 90%+ trust score that saw better efficiency, cost reduction, risk protection, and higher ROI of marketing campaigns.

 

Everyone should know what’s inside their data, score it, and improve it over time

Gartner predicts that “by 2022, 70% of organizations will rigorously track data quality levels via metrics, increasing data quality by 60% to significantly reduce operational risks and costs.”

Talend brought data profiling into the hands of data engineers. Now that everyone wants to use data, it’s equally important to let data workers understand the data, endorse it, score it, and improve it.

Data Trust Score by Talend

Talend Trust Score does just that. The Trust Score helps anyone to answer at a glance the question "How trustworthy is my dataset?" It's based not only on data quality indicators, but also on popularity and certification, so that reliable and authoritative datasets can be shared and populated across the organization.
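To make the idea concrete, here is a hypothetical blend of those three signals. The weights and scale are illustrative only, not the formula behind the actual Talend Trust Score:

```python
# Hypothetical dataset trust score: blends a quality score with a popularity
# signal and a certification bonus. Weights and scale are illustrative, not
# the formula Talend Trust Score actually uses.

def trust_score(quality, popularity, certified):
    """quality and popularity are 0-100; certification adds a fixed bonus."""
    return 0.6 * quality + 0.3 * popularity + (10 if certified else 0)

# A clean, widely used, certified dataset scores near the top...
print(trust_score(quality=90, popularity=80, certified=True))   # 88.0
# ...while an equally clean but unknown, uncertified dataset scores lower.
print(trust_score(quality=90, popularity=10, certified=False))  # 57.0
```

Weighting popularity and certification alongside raw quality is what lets a score like this surface authoritative datasets, rather than merely clean ones.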

 

We’re still in the early stages of the data quality journey. Data management practices are constantly evolving, and we’re seeing capabilities converging into a unified platform that can meet the needs of both business departments and IT.

We’re happy to help. We thank all the customers who have placed their trust in Talend. And to anyone who wants to bring clarity to their data chaos, we invite you to discover Talend, try our data quality stack, and become part of our growing user community.

 

 

Gartner, Magic Quadrant for Data Quality Solutions, Melody Chien, Ankush Jain, 27 July 2020
Gartner, Survey Analysis: Data Management Struggles to Balance Innovation and Control, Melody Chien, Nick Heudecker, 19 March 2020
Gartner, Build a Data Quality Operating Model to Drive Data Quality Assurance, Melody Chien, Saul Judah, Ankush Jain, 29 January 2020 

Gartner does not endorse any vendor, product or service depicted in its research publications, and does not advise technology users to select only those vendors with the highest ratings or other designation. Gartner research publications consist of the opinions of Gartner’s research organization and should not be construed as statements of fact. Gartner disclaims all warranties, express or implied, with respect to this research, including any warranties of merchantability or fitness for a particular purpose.
GARTNER is a registered trademark and service mark of Gartner, Inc. and/or its affiliates in the U.S. and internationally, and is used herein with permission. All rights reserved.
This graphic was published by Gartner, Inc. as part of a larger research document and should be evaluated in the context of the entire document. The Gartner document is available upon request from Talend.

The post Our reflections on the 2020 Gartner Magic Quadrant for Data Quality Solutions appeared first on Talend Real-Time Open Source Data Integration Software.

Talend Data Fabric August ’20 release: Expanding cloud capabilities to meet the needs of today’s data citizens

Talend is excited to announce the latest improvements to Talend Data Fabric, including expanded cloud capabilities, in our August ’20 release. Talend now offers Talend Cloud Data Catalog hosted on Amazon’s AWS platform. It provides the same great features and functionality as our on-premises solution but with no on-premises installation for a complete SaaS solution for data governance. This release also includes improvements to Talend Management Console and expanded capabilities for Talend Cloud on Azure.

Talend Data Catalog

 

Cloud Data Catalog

Finding, cataloging, and curating scattered and siloed data assets is a challenge for any aspiring data-driven organization. According to Gartner, “by 2021, organizations that offer a curated catalog of internal and external data to diverse users will realize twice the business value from their data and analytics investments than those that do not.” With that in mind, Talend introduces Talend Cloud Data Catalog. Liberating data value requires a tight combination of powerful metadata management capabilities with the agility of cloud infrastructures. While keeping the same powerful features as the on-premises version, Talend Cloud Data Catalog benefits from improved capabilities to democratize data trust:

Easy to deploy: Hosting a data catalog in the cloud enables data teams to accelerate the deployment of data solutions and meet the growing demands of organizations to control data as soon as they can.

Easy to update and maintain: Migration and versioning are often pain points for IT managers. By relying on modern cloud-based architectures, they can avoid complex and lengthy maintenance operations and frequent requests for vendor support. The Cloud Data Catalog benefits from automated updates and rollouts of new features. These updates run smoothly on AWS or Azure cloud infrastructure.

Easy to access: Distributed organizations often find access to remote systems to be complex and time-consuming, leaving those organizations with frustrated data workers who are unable to solve and control data issues. Talend Cloud Data Catalog removes the hassle of complex legacy systems by liberating data access for any type of data consumer.

 

Talend Cloud Data Catalog is the latest addition to Talend’s cloud-based, unified platform, which provides unmatched ease of use, requires little maintenance, and leaves no footprint on your in-house infrastructure. Talend Cloud Data Catalog is the perfect SaaS solution for your organization’s data governance challenges.

 

Talend Cloud on Azure

The cloud integration platform you choose should not dictate the cloud service you use. That’s why we’ve expanded our cloud offering on Microsoft Azure. We’ve added the full capabilities of Talend Data Stewardship, Data Inventory, Data Preparation, and API Services (coming in September), and expanded the capabilities of Talend Management Console. Stay tuned for more great things to come from Talend Cloud on Microsoft Azure.

 

Talend Management Console

We’ve improved Talend Management Console to equip your business with the latest cloud integration technology. Whether on-premises, in the cloud, or taking advantage of a hybrid implementation, Talend Management Console provides the necessary tools to manage your data integration teams and projects with ease.

Security: Many of the questions that companies ask when talking about a cloud integration platform deal with security; in today’s data-driven environment, customers depend on software providers to protect their personal information. We’ve added two notable security measures. IP allow lists let organizations restrict access to Talend Cloud to only trusted IP addresses. If an IP address is not on the allow list, it cannot access your Talend Cloud account. We also tightened security around the password type in context variables; now the password value is encrypted for better protection.
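To make the allow-list mechanism concrete, here is a minimal sketch of the underlying idea: a request is admitted only when its source address falls inside a trusted range. This illustrates the concept only, using Python's standard library; it is not Talend Cloud's implementation, and the CIDR ranges are made-up examples.

```python
# Conceptual sketch of an IP allow list (not Talend Cloud's actual code).
import ipaddress

def is_allowed(source_ip: str, allow_list: list[str]) -> bool:
    """Return True if source_ip falls within any allowed CIDR range."""
    addr = ipaddress.ip_address(source_ip)
    return any(addr in ipaddress.ip_network(cidr) for cidr in allow_list)

trusted = ["203.0.113.0/24", "10.8.0.0/16"]  # example office and VPN ranges
print(is_allowed("203.0.113.42", trusted))   # inside the office range: allowed
print(is_allowed("198.51.100.7", trusted))   # unknown address: rejected
```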

Audit: Regulatory compliance is something all organizations must address. Talend offers a unique Audit Logging API service that allows organizations to monitor activities on Talend Cloud applications. With this service, companies can ensure data security and manage regulatory compliance risks by performing advanced security analytics on audit logs that are easily collected and stored on premises. With further enhanced logging features, such as setting the logging level when creating or editing a task, your organization can have full control of traceability to maintain compliance.
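Once audit logs have been collected on premises, security analytics often starts with filtering events by type. The sketch below assumes a simple event shape with a "type" field purely for illustration; the real schema is defined by the Talend Cloud Audit Logging API.

```python
# Hedged sketch: post-processing collected audit events on premises.
# The "type" field and event shape are assumptions for illustration.
def filter_events(events, wanted_types):
    """Keep only the audit events whose type is in wanted_types."""
    return [e for e in events if e.get("type") in wanted_types]

collected = [
    {"type": "login.failed", "account": "acme", "ip": "198.51.100.7"},
    {"type": "task.executed", "account": "acme", "ip": "203.0.113.42"},
]
suspicious = filter_events(collected, {"login.failed"})  # flag failed logins
```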

Orchestration: A new public API v2.1 expands the public API capabilities in Talend Management Console. Users can list artifacts and orchestrate task and plan lifecycles through commands to update, delete, stop schedules, and list executions based on status or date. Whether you’re scheduling tasks directly in Talend Management Console or accessing the console through the new public API v2.1, Talend gives you the flexibility to manage schedules and tasks the way that works best for your organization. Further, Talend now allows route and data service deployments on Remote Engine clusters, enabling organizations to build and deploy microservices in a highly available environment.
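As a rough illustration of driving such an API from a script, the sketch below assembles a query URL that lists executions filtered by status. The base URL, path, and parameter names here are assumptions for illustration, not the documented v2.1 contract; consult the Talend Management Console API reference for the real endpoints.

```python
# Illustrative only: the endpoint and parameter names are assumptions.
from urllib.parse import urlencode

def build_executions_url(base_url, environment_id, status=None, from_date=None):
    """Assemble a URL that lists task executions, optionally filtered."""
    params = {"environmentId": environment_id}
    if status is not None:
        params["status"] = status        # e.g. a hypothetical "execution_failed"
    if from_date is not None:
        params["from"] = from_date       # ISO-8601 timestamp
    return f"{base_url}/executions?{urlencode(params)}"

url = build_executions_url("https://api.example.invalid/tmc/v2.1",
                           "env-123", status="execution_failed")
```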

 

 

Other improvements

Connectivity: Talend continues to deliver connectivity to new data sources, including Azure Event Hubs, Google Cloud Storage, Microsoft Dynamics 365, and NetSuite. Further, a REST connector in the cloud allows you to send HTTP requests and receive responses. By setting a polling interval to consume a REST API, this new REST connector enables you to create a streaming pipeline.
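Conceptually, a polling REST connector turns a request/response API into a stream by calling the endpoint on a fixed interval and emitting each response downstream. The sketch below shows that pattern in miniature; the fetch callable stands in for the actual HTTP request, and none of this is Talend's internal implementation.

```python
# Conceptual sketch of polling a REST API as a stream (fetch is a stand-in
# for an HTTP GET; this is not Talend's connector code).
import time

def poll_as_stream(fetch, interval_seconds, max_polls):
    """Yield successive responses, pausing between polls."""
    for i in range(max_polls):
        yield fetch()
        if i < max_polls - 1:
            time.sleep(interval_seconds)

# Dummy fetcher standing in for successive responses from the REST source:
pages = iter([{"page": 1}, {"page": 2}, {"page": 3}])
stream = list(poll_as_stream(lambda: next(pages), 0, 3))
```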

Data Inventory: We’ve made Data Inventory more powerful and easier to use. With faceted search capabilities, dataset tagging, and search bookmarks, it’s now easier than ever to find the datasets you’re looking for and share them across your organization. Improvements to local file imports eliminate the need to fill in a dataset creation form. When you drag and drop a file into Data Inventory, the software automatically detects parameters, and the data immediately opens in the dataset overview. Also, we’ve added in-product chat to Data Inventory so users can get immediate answers to any issues they may have. You can directly contact a Talend agent through chat with the Talend Support widget, available with Gold support and above (some restrictions may apply).

 

 

Expand your cloud capabilities with Talend Data Fabric

Whether you’re a data integration specialist needing to innovate faster or a data officer needing to enforce compliance across your organization, Talend continues to provide the tools and insights you need to deliver results. The improvements in the Talend Data Fabric August ’20 release provide expanded cloud capabilities and enhance an already robust cloud platform. Talend Cloud Data Catalog delivers a complete data governance application to the cloud as a full SaaS solution with ease of use, automated maintenance, and no installation required. Talend Management Console provides continuous improvements through security and logging enhancements while simplifying orchestration. Finally, adding platform capabilities on Microsoft Azure ensures Talend users can build anything and deploy anywhere.

 

For a complete list of Talend’s latest improvements, visit help.talend.com and click on Release Notes.

For more information on how Talend can help your organization achieve data clarity, contact your local account representative now!

 

The post Talend Data Fabric August ’20 release: Expanding cloud capabilities to meet the needs of today’s data citizens appeared first on Talend Real-Time Open Source Data Integration Software.

Our reflections on the 2020 Gartner Magic Quadrant for Data Integration Tools


It’s no secret that today’s data environment is chaotic. Organizations everywhere have to deal with more sources of data, more data types, and more use cases for that data, in order to make mission-critical decisions with confidence.

But when we talk about data chaos, we’re talking about more than just the data itself. The growing number of integration solutions to bring all this data together has become just as confusing and complex. And while the cloud ushered in innovative ways of using data to drive business decisions and analysis, it also introduced new challenges and complexities. Many of our customers face this reality – customers like Siemens, which integrates hundreds of systems to power its data initiatives.  Integration solutions need to address these challenges in order to help organizations extract the maximum value from their data.

Data integration has become more than simply moving data. A best-of-breed solution requires a unified approach that provides data integration, but also a comprehensive set of additional capabilities, including data quality, data stewardship, and other data management functions to deliver not just data, but trusted data to everyone in the organization.  The right data integration solution must also work across on-premises, cloud, and hybrid environments to meet the needs of your data ecosystem.

 

Achieving data clarity with a holistic approach to integration

This complex reality is obvious when looking at the latest report from Gartner, the 2020 Gartner Magic Quadrant for Data Integration Tools: more data integration vendors were evaluated this year than ever before. Gartner observes that this market is “seeing renewed momentum driven by urgent requirements for hybrid/multicloud data management, augmented data integration and data fabric designs,” and predicts that “by 2023, organizations utilizing data fabrics to dynamically connect, optimize and automate data management processes will reduce time to integrated data delivery by 30%.”

That’s why we are very proud to be recognized as a Leader in the 2020 Gartner Magic Quadrant for Data Integration Tools for the fifth year in a row. For this report, Gartner evaluated Talend along with 19 data integration vendors across criteria such as ability to execute and completeness of vision.

Magic Quadrant for Data Integration tools

We believe this recognition validates our vision of taking a unified approach to data integration as part of Talend Data Fabric – an approach that is cloud-agnostic, offers broad connectivity, and supports hybrid infrastructures.

 

This holistic approach provides unique advantages to Talend Data Fabric’s integration capabilities. For instance, organizations get the flexibility to connect anything to anything without worrying about their data architecture.

Many of our customers have data on-premises and across multiple cloud environments, and they need an integration solution that has the flexibility to support those ecosystems without making fundamental changes to their underlying infrastructure. This flexibility means organizations can spend less time dealing with technical challenges and more time deriving insight and business value from their data. Gartner also states, “Through 2025, over 80% of organizations will use more than one cloud service provider (CSP) for their data and analytics use cases, making it critical for them to prioritize an independent and CSP-neutral integration technology to avoid vendor lock-in.”

We believe this holistic approach also provides benefits beyond simply moving data to ensure businesses can make critical decisions with confidence. In addition to data integration, Talend Data Fabric offers broad capabilities including data quality everywhere, which allows you to fix and trust your data; application integration, which enables organizations to share their data anywhere; data preparation; data cataloging; data stewardship; and API management, to name a few. These capabilities enable organizations to rapidly deliver complete, clean, uncompromised data to all the users who need it.

 

Better results through better data integration

We’re seeing customers take advantage of these unified capabilities to supercharge their data integration implementations and achieve some truly remarkable results.

Uniper, a global energy company operating in more than 40 countries, needed to provide its users with self-service data and analytics in real time. It used Talend to integrate more than 120 internal and external sources to power its data analytics platform. The organization also relies on Talend’s data governance and data cataloging capabilities – available within the same platform – to ensure that its data is trustworthy and rapidly available. Among other benefits, this has resulted in an 80% reduction in integration costs, a 75% increase in data integration speed, and a 50% gain in synergies and efficiencies.

“Disruption is today’s keyword for the power industry. So, information counts. With our new data analytics platform powered by Talend, we now can better understand where the market is going, which helps us optimize energy trading while managing risk and complying with regulations.”

René Greiner
Vice President for Data Integration, Uniper SE

 

The only unified solution for data integration

All organizations deal with the constant challenge of using more data from more sources to accomplish more for their business. We believe their data integration solution should bring order to that chaos rather than add to it.

We believe only Talend Data Fabric combines data integration, data integrity, and data governance in a single platform so that all aspects of working with data are simplified and unified. We are honored to be a strategic business partner to thousands of companies, helping them find clarity amidst the data chaos. And to those still searching for that clarity, we invite you to discover Talend, try our integration solution, and become part of our growing user community.


Gartner, Magic Quadrant for Data Integration Tools, Ehtisham Zaidi, Eric Thoo, Nick Heudecker, Sharat Menon, Robert Thanaraj, 18 August 2020.
Gartner does not endorse any vendor, product or service depicted in its research publications, and does not advise technology users to select only those vendors with the highest ratings or other designation. Gartner research publications consist of the opinions of Gartner’s research organization and should not be construed as statements of fact. Gartner disclaims all warranties, express or implied, with respect to this research, including any warranties of merchantability or fitness for a particular purpose.

The post Our reflections on the 2020 Gartner Magic Quadrant for Data Integration Tools appeared first on Talend Real-Time Open Source Data Integration Software.
