The F1 system supports three different ways of querying data:
1. OLTP-style queries that affect only a few records
2. Low-latency OLAP queries involving large amounts of data
3. Large-scale ETL pipelines
F1's paper does not analyze why it supports these three query modes. Drawing on the 2013 F1 paper and other background, I analyze the reasons F1 supports each of them.
The OLTP type of query originated from F1's original goal: to replace the MySQL clusters in the advertising business. According to the 2013 F1 paper, its OLTP support is limited: an OLTP query in F1 consists of several read operations followed by zero or one write operation. F1's transaction processing relies on the transaction support of the underlying Spanner storage layer.
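The "several reads followed by zero or one write" shape can be sketched as a small wrapper that rejects anything else. This is only an illustration under my own assumptions: `ReadsThenWriteTxn`, `TransactionShapeError`, and the in-memory `store` dictionary are hypothetical names, not part of F1 or Spanner, and the sketch ignores concurrency and isolation entirely.

```python
from dataclasses import dataclass, field
from typing import Optional


class TransactionShapeError(Exception):
    """Raised when an operation breaks the reads-then-one-write shape."""


@dataclass
class ReadsThenWriteTxn:
    """Hypothetical wrapper: any number of reads, then at most one write."""
    store: dict                      # stand-in for the underlying storage
    reads: list = field(default_factory=list)
    pending_write: Optional[tuple] = None

    def read(self, key):
        # Reads are only allowed before the single write is issued.
        if self.pending_write is not None:
            raise TransactionShapeError("reads must come before the write")
        value = self.store.get(key)
        self.reads.append((key, value))
        return value

    def write(self, key, value):
        # A second write violates the "0 to 1 write" rule.
        if self.pending_write is not None:
            raise TransactionShapeError("at most one write per transaction")
        self.pending_write = (key, value)

    def commit(self):
        # Apply the buffered write, if any, and report (reads, writes).
        if self.pending_write is not None:
            key, value = self.pending_write
            self.store[key] = value
        return len(self.reads), 0 if self.pending_write is None else 1


# Usage: two reads, then one write derived from them.
store = {"budget:42": 100, "spend:42": 30}
txn = ReadsThenWriteTxn(store)
remaining = txn.read("budget:42") - txn.read("spend:42")
txn.write("remaining:42", remaining)
result = txn.commit()
```

In a real system the commit step is where the storage layer's transaction support does the actual work, which is why the query engine cannot offer this on its own.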
The 2018 paper does not describe OLTP queries in detail. However, common sense says that a stateless query engine that needs to support transactions cannot do so without transaction support from the underlying storage. So the F1 engine obviously cannot provide transactions for every data source it connects to. Spanner, for its part, has implemented its own query engine on top of storage that does support transactions. In this regard, F1 and Spanner are in clear competition.
Low-latency OLAP queries over large amounts of data are positioned much like BigQuery. The implementation also resembles BigQuery's: queries are executed, and results returned, mainly in a pipelined fashion.
According to the paper's own discussion of F1 and its competitors within Google, workloads moved to BigQuery or F1 after a Google system called Tenzing was shut down in earlier years. We can take this to mean that BigQuery and F1 compete for this type of query. In practice, BigQuery has been the more successful of the two.
In the early days, large-scale ETL pipelines inside Google were mostly built as a series of MapReduce jobs. Once Flume became available, these workloads moved to Flume. But Flume is a poor system to work with, and it takes a lot of code to express a simple data query. The paper explicitly mentions that F1 has successfully replaced Flume for some workloads.
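To make the "a lot of code for a simple query" point concrete, here is a rough sketch that computes the same per-key total twice: once as hand-written map, shuffle, and reduce steps, and once as a single SQL statement. Everything here is an assumption for illustration: the `clicks` data and the function names are made up, the pipeline code is plain Python rather than the Flume or MapReduce API, and Python's built-in `sqlite3` merely stands in for a declarative SQL engine.

```python
import sqlite3
from collections import defaultdict

# Hypothetical input: (ad_id, click_count) rows.
clicks = [("ad_a", 3), ("ad_b", 5), ("ad_a", 4), ("ad_c", 1), ("ad_b", 2)]


# Style 1: explicit pipeline stages written by hand.
def map_phase(rows):
    # Emit one (key, value) pair per input row.
    for ad_id, n in rows:
        yield ad_id, n


def shuffle_phase(pairs):
    # Group all values that share a key.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    # Collapse each group to its sum.
    return {key: sum(values) for key, values in groups.items()}


pipeline_result = reduce_phase(shuffle_phase(map_phase(clicks)))

# Style 2: the same aggregation as one declarative query.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE clicks (ad_id TEXT, n INTEGER)")
conn.executemany("INSERT INTO clicks VALUES (?, ?)", clicks)
sql_result = dict(
    conn.execute("SELECT ad_id, SUM(n) FROM clicks GROUP BY ad_id")
)
conn.close()
```

Both variables end up holding the same mapping from ad to total clicks. The difference is how much the author of the query has to write and maintain, which is the usability gap this section attributes to pipeline-style systems.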
Combining the analysis above, we can draw the following conclusions. OLTP workloads were F1's primary target in its early years. F1 relies on Spanner for its OLTP support, and Spanner later developed a similar query engine of its own. This is consistent with what I have heard: F1 is used primarily by the advertising department, while non-advertising departments use Spanner heavily.
For low-latency OLAP queries, F1's main competitor is BigQuery. Given BigQuery's success today, F1 probably has a user base only in its home advertising department.
Flume is a mixed bag within Google: better than MapReduce, but not easy to use. F1 is a strong contender for ETL workloads and can capture a portion of that market. From a technical architecture point of view, making ETL easier to use is the most important technology in the F1 team's 2018 paper.