Skip to main content

Google Cloud Dataflow

  • Cloud Dataflow is a fully managed data processing service for executing a wide variety of data processing patterns.

Features

  • Dataflow templates allow you to easily share your pipelines with team members and across your organization.
  • You can also take advantage of Google-provided templates to implement useful but simple data processing tasks.
  • Autoscaling lets the Dataflow automatically choose the appropriate number of worker instances required to run your job.
  • You can build a batch or streaming pipeline protected with customer-managed encryption key (CMEK) or access CMEK-protected data in sources and sinks.
  • Dataflow is integrated with VPC Service Controls to provide additional security on data processing environments by improving the ability to mitigate the risk of data exfiltration.

Pricing

  • Dataflow jobs are billed per second, based on the actual use of Dataflow batch or streaming workers. Additional resources, such as Cloud Storage or Pub/Sub, are each billed per that service’s pricing.

Comments

Popular posts from this blog

Google Cloud Pub/Sub

  Cloud Pub/Sub is a fully-managed real-time messaging service for event driven systems that allows you to send and receive messages between independent applications. Features Capable of global message routing to simplify multi-region systems. Synchronous, cross-zone message replication and per-message receipt tracking ensure at-least-once delivery at any scale. Pub/Sub delivers each message at least once, so the Pub/Sub service might redeliver messages. You can declare independent quota and billing for publishers and subscribers. Cloud Pub/Sub doesn’t have shards or partitions. You just need to set your quota, publish, and consume. Key Concepts Topic It is a named resource to which publishers send messages. Subscription Is a named resource representing the stream of messages from a specific topic, to be sent to the subscribing application. Message The combination of data and attributes that a publisher sends to a topic and is eventually sent to subscribers. Message attribute A key...

Google Cloud Dataprep

  Cloud Dataprep by Trifacta is an intelligent data service for visually exploring, cleaning, and preparing structured and unstructured data for analysis, reporting, and machine learning. Features You can transform structured or unstructured datasets of any size — megabytes to petabytes — with equal ease and simplicity. Cloud Dataproc can transform datasets stored in CSV, JSON, or relational table formats. You can process data stored in Cloud Storage, BigQuery, or from your desktop, then export the refined data to BigQuery or Cloud Storage for storage, analysis, visualization, or machine learning. Uses a proprietary algorithm that interprets the data transformation intent of a user’s data selection. You can leverage hundreds of transformation functions readily available to turn your data into the asset you want. Cloud Dataprep enables users to collaborate on similar flow objects in real-time or to create copies for other team members to use for independent tasks. Explore your data ...

Google Cloud Identity and Access Management

  Create and manage permissions for your Google Cloud resources with Identity Access Management (IAM). Provides a unified view into your organization’s security policy with built-in auditing to ease compliance purposes. Features Lets you authorize who can take specific actions on resources to give you full control and visibility on your Google Cloud services centrally. Permissions are represented in the form of  service.resource.verb Can map job functions into groups and roles. With IAM, users only get access to what they need to get the job done. Cloud IAM enables you to grant access to cloud resources at fine-grained levels, well beyond project-level access. You can leverage Cloud Identity to easily create or sync user accounts across applications and projects. IAM lets you set policies at the following levels of the resource hierarchy: Organization level The organization resource represents your company. IAM roles granted at this level are inherited by all resources under t...