A look at the F1 database from Google's advertising department, through its VLDB papers




Recently I have been reading some VLDB papers for work, including an analysis of Google's newly published F1 paper. Google papers are never easy to read, because Google always reveals half and hides half. This one is written relatively openly, though it still cannot entirely escape that habit.


This paper is a follow-up to Google's 2013 VLDB paper, F1: A Distributed SQL Database That Scales, and gives a comprehensive picture of how Google's F1 database has evolved over the years. I discuss it in detail below.


Background on F1 and its competitors

Let's review the history of F1. F1 is a data query system that supports multiple data sources. It was originally born in Google's advertising arm, and its original purpose was to replace the MySQL cluster the advertising system ran on at the time. F1 was positioned as a query engine from the beginning, following a strict separation of compute and storage. The storage system underneath was Spanner, the next-generation successor to BigTable, which was being developed in parallel at the time.


Then, at VLDB 2014, Google published Mesa, a cross-data-center, globally replicated data warehouse. Mesa became the second major storage system F1 connected to. Today F1 has evolved into a system that supports federated queries over data sources ranging from CSV files to BigTable to Spanner.


Over the years, Google has built up a number of data processing systems, and these systems compete with one another fiercely: if I can grab customers from you, my team grows bigger. F1, as a system that kept growing inside Google, is one of the winners of this competition.


Understanding the history and target users of these systems is important for a deeper understanding of the business F1 supports and the technical choices it made. So before discussing the F1 paper itself, let me briefly introduce the other Google database systems it relates to.


F1 was originally positioned to replace the MySQL cluster in Google's Ads division. Spanner, the system underneath F1, is a storage tier that supports transaction processing (implemented with two-phase locking), while F1 exists as the compute engine on top of it.


But after building the storage layer, the Spanner team moved into data querying as well and developed an internal query engine called Spandex. How Spanner evolved into a full SQL system was described in a paper published at SIGMOD 2017. This put F1 and Spanner in direct competition, and to this day the rivalry between the two teams inside Google remains fierce.


Dremel is Google's in-house data warehouse and analytics system. Google commercialized Dremel under the name BigQuery. Dremel uses a semi-structured data model stored in a columnar format; the first-generation format was ColumnIO.


The second-generation format, Capacitor, was introduced after commercialization. Both formats are external data sources supported by F1. Dremel has been unusually successful within Google, and to date BigQuery remains the most successful big data product on Google's cloud.


Flume is an upgrade of Google's internal MapReduce framework. It was originally developed only in Java and called FlumeJava; a C++ version followed later. Flume changed the development model of hand-writing Map and Reduce functions in the MapReduce framework by introducing higher-level APIs, so that writing a pipeline feels similar to developing with Spark.


In the underlying execution environment, Flume also relaxed MapReduce's rigid structure to support patterns such as Map-Reduce-Reduce. Its advantage is that it is very flexible for writing all kinds of data processing pipelines; its disadvantage is that even simple things require a lot of code, nowhere near as simple as SQL.


F1's business positioning

The F1 system supports three different ways of querying data:

1. OLTP-style queries that touch only a few records

2. Low-latency OLAP queries involving large amounts of data

3. Large-scale ETL pipelines


The F1 paper itself does not analyze why it supports these three kinds of queries. Combining the 2013 F1 paper with other background, I will analyze the reasons F1 supports each of them.


The OLTP type of query originated from F1's original goal: replacing the MySQL clusters in the advertising business. According to the 2013 F1 paper, its OLTP support is limited: an OLTP query in F1 consists of several reads followed by zero or one write. F1's OLTP transaction capability relies on the underlying Spanner's support for transactions.


In the 2018 paper, the authors do not describe the OLTP type of queries in detail. However, by common-sense reasoning, a stateless query engine that needs to support transactions cannot do so without transactional support from the underlying storage. So the F1 engine obviously cannot provide transactional processing for every data source it connects to. And given that Spanner now implements its own query engine with transaction support, F1 and Spanner are in clear competition here.


Low-latency OLAP queries over large amounts of data are positioned much like BigQuery. Their implementation is also similar to BigQuery's, mainly executing the query as a pipeline and streaming results back.


According to the paper's own discussion of F1 and its competitors inside Google, after an earlier system called Tenzing was shut down, its workloads moved to either BigQuery or F1. So BigQuery and F1 are competitors for this type of query, and in practice BigQuery has been more successful.


In the early days inside Google, large-scale ETL pipelines were mostly built as chains of MapReduce jobs. After Flume appeared, these workloads moved to Flume. But Flume is hard to use; even a simple data query takes a lot of code. In this paper, the authors explicitly mention that F1 has successfully replaced Flume in some businesses.


Combining the above analysis, we can draw a rough conclusion. F1's OLTP workloads inside Google are mainly the ones it targeted in its early years, and F1 relies on Spanner for OLTP support; later, Spanner developed a similar engine of its own. This is consistent with what I have heard about F1: it is used mainly by the advertising department, while non-advertising departments use Spanner heavily.


For low-latency OLAP queries, F1's main competitor is BigQuery. Given BigQuery's success today, F1 probably only retains a business base in its home advertising department.


Flume has a mixed reputation within Google: better than MapReduce, but not easy to use. By pushing into the ETL business, F1 can capture a portion of that market. From a technical architecture point of view, how to support ETL well is the most important piece of technology in the F1 team's 2018 paper.


F1's system architecture

The following image is the architecture diagram of the F1 system in the 2018 paper:


The following image is the F1 system architecture diagram in the 2013 paper:



F1 can be deployed across multiple data centers, and each data center runs a complete computing cluster. Each cluster has one F1 Master, an elected service (and thus not a single point of failure) that is unique within its data center; its main job is to monitor query execution and manage all the F1 Servers. The cluster also contains a number of F1 Servers, which actually handle query requests.


There is also a pool of F1 Workers. When a query needs to run in parallel, these workers execute the parallel fragments, while the corresponding F1 Server acts as the coordinator of the query. Workers were called Slaves in the 2013 architecture diagram; it is just a different name. The actual responsibilities of the F1 Server were described more clearly in the 2013 paper.
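
To make this division of labor concrete, here is a minimal sketch in Python, with hypothetical names of my own (the real F1 components are internal Google services communicating over RPC), of a coordinator fanning a plan fragment out to a worker pool and merging the partial results:

```python
from concurrent.futures import ThreadPoolExecutor

def worker_execute(fragment, partition):
    """A worker evaluates one plan fragment over one partition of the data."""
    return [row for row in partition if fragment["predicate"](row)]

def coordinator_execute(fragment, partitions, pool_size=4):
    """The F1 Server as coordinator: fan the fragment out to the worker pool,
    then merge the partial results (here, threads stand in for RPC calls)."""
    with ThreadPoolExecutor(max_workers=pool_size) as pool:
        futures = [pool.submit(worker_execute, fragment, part) for part in partitions]
        merged = []
        for future in futures:
            merged.extend(future.result())
    return merged

if __name__ == "__main__":
    partitions = [[{"clicks": i} for i in range(start, start + 5)]
                  for start in range(0, 20, 5)]
    fragment = {"predicate": lambda row: row["clicks"] % 2 == 0}
    print(coordinator_execute(fragment, partitions))
```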


The system also includes a Catalog Service and UDF Servers; both are additions relative to the 2013 architecture. The Catalog Service is a metadata service that exposes data in the different data sources as external tables. In the 2013 architecture the only data source was Spanner, while in the 2018 paper the data sources have diversified, so the Catalog Service became a necessary component as F1 evolved into a federated query engine over multiple data sources.


The UDF Server is something F1 reveals for the first time in the 2018 paper. Its main significance is to support ETL and the replacement of Flume. We will cover it in more detail later.

F1's query modes

Combining the 2013 and 2018 papers, F1's query execution can be roughly divided into interactive and non-interactive modes. Interactive execution is mainly used for OLTP-style queries that touch only a few records and for low-latency OLAP queries over large amounts of data. Both types are executed through the F1 Server.


The F1 Server compiles and optimizes the query and then generates an execution plan. There are two kinds of plans: single-threaded and parallel. The former is executed directly by the server; for the latter, the server becomes the coordinator of the whole parallel query and invokes workers via RPC. The paper discusses some of the decisions around the partitioning strategy and how to improve performance, mainly concerning data skew and access patterns. These practices are common in distributed databases; interested readers can consult the paper, and I will not expand on them here.
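
As a rough illustration of what repartitioning between parallel workers involves (a sketch only; the paper does not give the actual algorithm, and the names here are made up), a hash-based exchange might distribute rows to workers by key, which is exactly where skewed keys become a problem:

```python
import hashlib
from collections import defaultdict

def hash_partition(rows, key, num_workers):
    """Assign each row to a worker by hashing its partitioning key."""
    buckets = defaultdict(list)
    for row in rows:
        h = int(hashlib.md5(str(row[key]).encode()).hexdigest(), 16)
        buckets[h % num_workers].append(row)
    return buckets

rows = [{"campaign": c, "clicks": n} for c, n in
        [("a", 10), ("a", 7), ("b", 3), ("c", 1), ("a", 2)]]
buckets = hash_partition(rows, "campaign", num_workers=2)
# A heavily skewed key (here "a") lands entirely on one worker, which is why
# F1 has to think about data skew when choosing a partitioning strategy.
for worker, part in sorted(buckets.items()):
    print(worker, part)
```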

The authors note that interactive execution is reliable for queries running up to roughly an hour; beyond that, they may fail. According to the paper, F1's distributed interactive execution has no fault tolerance of its own, but the F1 client can retry failed queries. For a mature system, this is somewhat of a pity.

Non-interactive execution is mainly used for long-running queries and relies on Google's MapReduce framework. The query is compiled into a query plan and stored in the Query Registry, a globally distributed, cross-data-center Spanner database that tracks metadata for all batch-mode queries. There is also a global, cross-data-center Query Distributor service, which assigns each query plan to a data center, where it is executed on the MapReduce framework.

Within the MapReduce framework, F1's optimization introduces a Map-Reduce-Reduce pattern, which does not fit MapReduce's structure. The F1 team's workaround is to translate it into a MapReduce job followed by a Map<identity>-Reduce job. This is clearly not the most efficient approach. It shows that running long queries on MapReduce is not the most effective option, and that F1 cannot escape the constraints of its execution framework.
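
A minimal sketch in Python (purely illustrative, not Google's actual implementation) of the rewrite described above: a Map-Reduce-Reduce plan is flattened into legal MapReduce jobs by inserting an identity map in front of any reduce that lacks its own map:

```python
def to_mapreduce_jobs(stages):
    """Rewrite a logical stage list such as ['map', 'reduce', 'reduce'] into
    (map, reduce) job pairs the MapReduce framework can actually run,
    inserting an identity map where a reduce has no preceding map."""
    jobs, pending_map = [], None
    for stage in stages:
        if stage == "map":
            pending_map = "map"
        elif stage == "reduce":
            jobs.append((pending_map or "map<identity>", "reduce"))
            pending_map = None
    return jobs

print(to_mapreduce_jobs(["map", "reduce", "reduce"]))
# -> [('map', 'reduce'), ('map<identity>', 'reduce')]
# The identity pass forces an extra read and shuffle of the intermediate data,
# which is exactly the inefficiency noted above.
```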


F1's optimizer

The F1 optimizer follows a fairly classic query optimization pipeline. It takes the AST produced by the compiler as input and first converts it into a logical query plan; after logical optimization, a physical query plan is generated; finally, the execution plan generator turns the physical plan into an executable plan.

Logical optimization applies relational algebra rewrites to transform the input logical plan into a heuristically better one; common optimizations such as predicate pushdown happen here. Physical planning then translates the logical plan into a physical plan. Finally, the execution plan generator segments the physical plan, with each segment becoming an execution unit, and inserts exchange operators between execution units to repartition the data. The degree of parallelism for each execution unit is also determined at this stage.
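
To show what a purely heuristic, rule-based pass looks like (a toy sketch of predicate pushdown in Python, not F1's actual code; operator names are invented for illustration):

```python
from dataclasses import dataclass

# Toy logical operators.
@dataclass
class Scan:
    table: str
    pushed_filters: tuple = ()

@dataclass
class Filter:
    predicate: str
    child: object

@dataclass
class Project:
    columns: tuple
    child: object

def push_down_filters(plan):
    """Heuristic rule: move a Filter below a Project and, once it reaches a
    Scan, fold it into the scan so the data source can apply it.
    (Assumes the predicate only uses columns that survive the projection.)"""
    if isinstance(plan, Filter):
        child = push_down_filters(plan.child)
        if isinstance(child, Project):
            return Project(child.columns,
                           push_down_filters(Filter(plan.predicate, child.child)))
        if isinstance(child, Scan):
            return Scan(child.table, child.pushed_filters + (plan.predicate,))
        return Filter(plan.predicate, child)
    if isinstance(plan, Project):
        return Project(plan.columns, push_down_filters(plan.child))
    return plan

plan = Filter("clicks > 100", Project(("campaign", "clicks"), Scan("AdEvents")))
print(push_down_filters(plan))
# Project(columns=('campaign', 'clicks'),
#         child=Scan(table='AdEvents', pushed_filters=('clicks > 100',)))
```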


The F1 optimizer as a whole is relatively primitive: it is entirely rule-based, with no cost-based optimization. Compared with common data warehouse systems, it has a lot of room for improvement.


F1's extensibility

F1 supports user-defined functions (UDFs), user-defined aggregate functions (UDAs), and table-valued functions (TVFs). These are the usual extension mechanisms in database systems, and in F1 they can be implemented in SQL or Lua scripts. For the most part these are classic database implementations.


What is more special about F1 is the introduction of the UDF server, which is mainly used to implement more complex TVFs. A UDF server is a separate service that can be implemented in any language and exposes a function interface for TVFs to F1. Besides receiving input rows and returning results at run time, the interface also gives the compiler and optimizer extra information at query compilation time, such as the TVF's output schema and whether the TVF can be partitioned so that each partition is executed independently.
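
The paper does not spell out this interface, so the following is only a hedged guess at what the contract might resemble, sketched in Python (real UDF servers are standalone services that talk to F1 over RPC; all names below are made up for illustration):

```python
from abc import ABC, abstractmethod
from typing import Iterable, List, Tuple

class TableValuedFunction(ABC):
    """Hypothetical contract a UDF server exposes to F1 for one TVF."""

    @abstractmethod
    def output_schema(self) -> List[Tuple[str, str]]:
        """Compile-time info: column names and types of the produced table."""

    @abstractmethod
    def is_partitionable(self) -> bool:
        """Compile-time info: may F1 run the TVF independently per partition?"""

    @abstractmethod
    def apply(self, rows: Iterable[dict]) -> Iterable[dict]:
        """Run-time path: consume input rows, emit output rows."""

class NormalizeClicks(TableValuedFunction):
    """Example ETL step: clean click counts and add a derived column."""

    def output_schema(self):
        return [("campaign", "STRING"), ("clicks", "INT64"), ("valid", "BOOL")]

    def is_partitionable(self):
        return True  # each row is handled independently

    def apply(self, rows):
        for row in rows:
            clicks = max(0, int(row.get("clicks", 0)))
            yield {"campaign": row["campaign"], "clicks": clicks, "valid": clicks > 0}

# A query could then invoke the TVF from SQL, roughly along the lines of:
#   SELECT * FROM NormalizeClicks((SELECT campaign, clicks FROM RawEvents))
print(list(NormalizeClicks().apply([{"campaign": "a", "clicks": "-3"}])))
```

Because the compiler learns the output schema and partitionability ahead of time, it can plan the TVF like any other operator and push it into parallel execution units.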


The paper devotes very little ink to the UDF server, but in my opinion it is the most important difference between the 2018 and 2013 papers. With UDF servers, complex ETL logic becomes possible. The UDF server also solves an old problem with UDFs in databases: resource management. If I had to pick the single brightest idea in the paper, it would be the UDF server.


I believe Google's F1 developers are well aware of the importance of the UDF server, yet the paper says little about it. One cannot rule out that this was intentional.


The UDF server makes it possible for F1 to support complex ETL, while the standard data processing logic in an ETL pipeline can still be written directly in SQL. And because the UDF server is a separate service, the resource management problems that usually come with UDFs are also solved.


Summary

The 2018 VLDB F1 paper describes the architecture and evolution of Google's F1 database. F1 has evolved into a query engine that supports multiple kinds of queries over multiple data sources. Its OLTP-style queries still mainly serve its original task of replacing MySQL. Its low-latency OLAP queries mainly compete with Dremel. And its support for complex ETL is aimed primarily at replacing Flume.


F1 has three execution modes: single-threaded execution, distributed interactive execution, and non-interactive execution based on MapReduce. It is a pity that distributed interactive execution has no failure recovery, and the performance of the MapReduce-based non-interactive execution has room for further optimization.


The F1 optimizer is a classic database optimizer, but it only performs rule-based optimization, with no cost-based optimization. So I do not think it can do optimizations like join reordering. The optimizer is quite simple and has a lot of room for improvement.


In terms of extensibility, UDFs, UDAs, and TVFs are all classic database extension mechanisms. The UDF server, however, is a very important invention; of everything in this paper, I think it has the greatest reference value. Yet the paper apparently deliberately glosses over it.


Compared with 2013, F1's architecture adds a metadata service, the Catalog. A catalog plays an important role in data lake scenarios: it is essential for data discovery and sharing, its role in access control is irreplaceable, and cost-based optimization also depends on metadata services. It is a great pity that the 2018 paper says so little about this new addition.




