Google Cloud Storage

 

  • An object storage service that stores data within buckets.

Buckets

  • The data you upload to Cloud Storage is stored as objects.
  • An object is an immutable piece of data consisting of a file in any format.
  • You store objects inside containers called buckets.
  • All buckets belong to a project.
  • Each project can have multiple buckets.
  • You can also configure a Cloud Storage bucket to host a static website for a domain you own.

Bucket Configurations

  • Lifecycle Management
    • With Object Lifecycle Management, you can define conditions that trigger the deletion of objects or their transition to a cheaper storage class.
  • Versioning
    • Keeps older versions of objects when they are deleted or overwritten.
  • Retention Policies
    • Define a minimum retention period for which objects must be stored before they can be deleted or overwritten.
  • Object holds
    • Place a hold on an object to prevent it from being deleted or replaced.
  • Encryption keys
    • Customer-managed
    • Customer-supplied
  • Access Permissions
    • Access Control List
    • Uniform bucket-level access
    • Object and bucket-level permissions
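
As a rough sketch of what lifecycle conditions can look like, the Python snippet below builds a lifecycle configuration in the JSON shape that `gsutil lifecycle set` accepts. The 30-day and 365-day ages and the function name are illustrative assumptions, not recommendations.

```python
import json


def build_lifecycle_config(nearline_after_days=30, delete_after_days=365):
    """Return a lifecycle configuration as a plain dict.

    The day thresholds are illustrative defaults, not recommendations.
    """
    return {
        "rule": [
            {
                # Move objects to a cheaper class once they reach this age.
                "action": {"type": "SetStorageClass", "storageClass": "NEARLINE"},
                "condition": {"age": nearline_after_days},
            },
            {
                # Delete objects once they reach this age.
                "action": {"type": "Delete"},
                "condition": {"age": delete_after_days},
            },
        ]
    }


config = build_lifecycle_config()
lifecycle_json = json.dumps(config, indent=2)
print(lifecycle_json)
```

If the output is saved to a file (say, a hypothetical `lifecycle.json`), it could then be applied with `gsutil lifecycle set lifecycle.json gs://BUCKET_NAME`.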

Storage Classes

  • Standard Storage
    • Good for hot data that is accessed frequently.
  • Nearline Storage
    • Good for use cases that need to store objects for at least 30 days.
    • Ideal for data that you plan to access once per month or less.
  • Coldline Storage
    • A low-cost option for infrequently accessed data that you plan to store for at least 90 days.
  • Archive Storage
    • The coldest, lowest-cost storage class.
    • Designed for archive and disaster recovery data that you expect to access less than once a year. Objects have a 365-day minimum storage duration.
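
One way to read the list above is as a simple rule of thumb: pick the coldest class whose minimum storage duration (30, 90, or 365 days) still fits how often the data is accessed. The sketch below encodes that heuristic. The function name is hypothetical, and the rule ignores retrieval and operation costs, so treat it as a starting point only.

```python
# Minimum storage durations (in days) for each class, ordered warm to cold.
MIN_STORAGE_DAYS = {"STANDARD": 0, "NEARLINE": 30, "COLDLINE": 90, "ARCHIVE": 365}


def suggest_storage_class(expected_days_between_accesses):
    """Pick the coldest class whose minimum duration fits the access pattern.

    This is a heuristic only: it ignores retrieval fees and operation costs.
    """
    suggestion = "STANDARD"
    for name, min_days in MIN_STORAGE_DAYS.items():
        if expected_days_between_accesses >= min_days:
            suggestion = name
    return suggestion


for days in (1, 30, 100, 400):
    print(days, suggest_storage_class(days))
```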

gsutil tool

  • A Python application that enables you to manage your Cloud Storage from the command line.
  • You can use gsutil to perform bucket and object management tasks like:
    • creating and deleting buckets
    • uploading, downloading, and deleting objects
    • listing buckets and objects
    • moving, copying, and renaming objects
    • editing object and bucket ACLs
  • gsutil performs all operations over HTTPS, secured with TLS.
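
To make the tasks above concrete, here is a small sketch that assembles the corresponding gsutil command lines in Python without executing them. The bucket name, file name, and user address are hypothetical placeholders.

```python
import shlex


def gsutil_command(*args):
    """Return the argv list for a gsutil invocation (nothing is executed)."""
    return ["gsutil", *args]


bucket = "gs://example-bucket"  # hypothetical bucket name

commands = [
    gsutil_command("mb", bucket),                    # create a bucket
    gsutil_command("cp", "report.csv", bucket + "/"),  # upload an object
    gsutil_command("ls", bucket),                    # list objects in the bucket
    # rename (move) an object within the bucket
    gsutil_command("mv", bucket + "/report.csv", bucket + "/archive/report.csv"),
    # grant read access on the object to a (hypothetical) user via its ACL
    gsutil_command("acl", "ch", "-u", "jane@example.com:R",
                   bucket + "/archive/report.csv"),
    gsutil_command("rm", bucket + "/archive/report.csv"),  # delete the object
    gsutil_command("rb", bucket),                    # delete the now-empty bucket
]

for argv in commands:
    print(shlex.join(argv))
```

Each list could be passed to `subprocess.run(argv, check=True)` on a machine where gsutil is installed and authenticated. Note that the `acl ch` step only applies to buckets that do not use uniform bucket-level access.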

Uploading objects to GCS

You can send upload requests to Google Cloud Storage via the following methods:

  • Simple Upload – use this if the file is small enough to upload again if the connection fails, and there is no object metadata to send as part of the upload request.
  • Multipart Upload – use this if the file is small enough to upload again if the connection fails, and you need to include object metadata as part of the upload request.
  • Resumable Upload – use this for a more reliable transfer, which is especially important with large files.
  • Parallel composite uploads – use these if network and disk speed are not limiting factors. In a parallel composite upload, a file is divided into up to 32 chunks, which are uploaded in parallel as temporary objects. The final object is composed from the temporary objects, which are then deleted.
  • Alternatively, for uploading large volumes of data (from hundreds of terabytes up to 1 petabyte), you can use the Transfer Appliance. It is a hardware appliance that lets you securely migrate data to Google Cloud Platform without disrupting business operations.
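
The choice between these methods can be sketched as a small decision function, together with the chunking step of a parallel composite upload. This is an illustration only: the function names and the 8 MiB "small file" cutoff are assumptions made for the example, not values taken from Cloud Storage itself.

```python
import math

MAX_COMPOSITE_CHUNKS = 32  # a parallel composite upload uses up to 32 chunks


def choose_upload_method(size_bytes, has_metadata,
                         small_file_limit=8 * 1024 * 1024):
    """Pick an upload method following the rules of thumb above.

    small_file_limit is a hypothetical cutoff for "small enough to re-upload".
    """
    if size_bytes > small_file_limit:
        return "resumable"
    return "multipart" if has_metadata else "simple"


def split_into_chunks(size_bytes, chunk_count=MAX_COMPOSITE_CHUNKS):
    """Return (offset, length) byte ranges covering the file in <= 32 chunks."""
    if size_bytes <= 0:
        return []
    chunk_count = max(1, min(chunk_count, MAX_COMPOSITE_CHUNKS, size_bytes))
    chunk_size = math.ceil(size_bytes / chunk_count)
    return [
        (offset, min(chunk_size, size_bytes - offset))
        for offset in range(0, size_bytes, chunk_size)
    ]


print(choose_upload_method(1024, has_metadata=True))
print(len(split_into_chunks(1000)))
```

Each `(offset, length)` pair stands for one temporary object that would be uploaded in parallel and later composed into the final object.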

Pricing

  • Pricing for Cloud Storage services is based on what you use, including:
    • the amount of data you store,
    • the duration for which you store it,
    • the number of operations you perform on your data,
    • the network resources used when moving or accessing your data.
  • For “cold” storage classes meant to store long-term, infrequently accessed data, there are also charges for retrieving data and early deletion of data.
  • With Requester Pays enabled on a bucket, requesters must include a billing project ID in their requests, and that project is billed for the network charges, operation charges, and retrieval fees.
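
The early deletion charge can be illustrated with a short calculation: an object deleted before its class's minimum storage duration is billed as if it had been stored for the full minimum. In the sketch below, the function names are hypothetical, the price is a placeholder supplied by the caller rather than a real rate, and a month is approximated as 30 days. Operation, network, and retrieval charges are left out.

```python
# Minimum storage durations (in days) for each class.
MIN_STORAGE_DAYS = {"STANDARD": 0, "NEARLINE": 30, "COLDLINE": 90, "ARCHIVE": 365}


def billable_storage_days(storage_class, days_stored):
    """Days billed for storage, including any early deletion charge."""
    return max(days_stored, MIN_STORAGE_DAYS[storage_class])


def estimate_storage_cost(storage_class, size_gb, days_stored, price_per_gb_month):
    """Rough storage-only estimate.

    price_per_gb_month is a placeholder supplied by the caller, not a real rate.
    A month is approximated as 30 days.
    """
    days = billable_storage_days(storage_class, days_stored)
    return size_gb * price_per_gb_month * days / 30


# An object deleted from Coldline after 60 days is still billed for 90 days.
print(billable_storage_days("COLDLINE", 60))
```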
