Oracle fact table creation and loading strategy

askTom 2018-10-11

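Question

Thanks in advance for your support. I have a few questions:

1. I will have to load 7 years' worth of history data, 128 billion rows, into a new fact table, which will also receive inserts as well as updates for the last 90 days (98% inserts and 2% updates on a daily basis). What should we use: interval or range partitioning (day granularity)?

2. Can I create a history fact table holding all the read-only data and keep the current year's data in another fact table?
Possible columns for the table are date, txn id, quality_id, interval id, meter number, read value, created date, modified date.

3. What would be the best index strategy on such a huge table? We will need a primary key as well as additional indexes for quick retrieval of at least 60 days of data.

The primary key will be date, trans id, interval id, meter number. We may also need indexes on meter number, on date and meter number, and on modified date.

4. We have a very short window of time to load the above 128 billion rows. The data comes from 3 different Oracle databases (5 years of data need an unpivot function, and the remaining 2 years use some transformation and processing). We are currently using Informatica. What would be the best strategy to load the data?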

Answer

1. I will have to load 7 years worth of history data with 128 billion rows into a new Fact Table, which also will have inserts as well as updates for the last 90 days (98% inserts and 2% updates on daily basis). What should we use? Interval or range partitioning (day granularity)?

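Interval seems like the obvious choice, because it gives you all the benefits of range partitioning without any of the drawbacks. Daily granularity does fit your business requirements. 7 x 365 days is starting to be a lot of partitions, but you could consider a hybrid model. See my blog post on that concept: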

https://connor-mcdonald.com/2018/07/25/hyper-partitioned-index-avoidance-thingamajig/
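To put rough numbers on the partition count, here is a small Python sketch. The row count, the 7 years, and the 90-day window come from the question. The end date (taken from the thread date) and the particular hybrid layout (daily partitions for the last 90 days, one partition per calendar month before that) are assumptions for illustration, and not necessarily the scheme the blog post describes.

```python
from datetime import date, timedelta

TOTAL_ROWS = 128_000_000_000   # from the question
YEARS = 7                      # from the question
HOT_DAYS = 90                  # updates touch only the last 90 days (from the question)

# Assumption: the history ends on the date of the thread.
end = date(2018, 10, 11)
start = date(end.year - YEARS, end.month, end.day)

total_days = (end - start).days
rows_per_day = TOTAL_ROWS // total_days   # average, assuming an even spread

# Pure daily partitioning: one partition per calendar day.
daily_partitions = total_days

# Hybrid layout (assumption): daily partitions for the last HOT_DAYS days,
# one partition per calendar month for everything older.
hot_start = end - timedelta(days=HOT_DAYS)
cold_months = {
    (d.year, d.month)
    for d in (start + timedelta(days=i) for i in range((hot_start - start).days))
}
hybrid_partitions = len(cold_months) + HOT_DAYS

print(daily_partitions, rows_per_day, hybrid_partitions)
```

Under these assumptions the daily layout needs 2,557 partitions averaging about 50 million rows each, while the hybrid layout needs 172.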

2. Can I create a History Fact with all the read only data and keep the current year data in another fact?

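I don't think you need to do that. A single table can hold a mix of read-only and read-write data, and you can control that at the partition level. So a second table seems unnecessary.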
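As a sketch of what partition-level control could look like, the Python helper below renders one `ALTER TABLE ... READ ONLY` statement for each daily partition that has aged out of the 90-day update window. It assumes Oracle 12.2 or later, where partitions can be marked read only. The table name `meter_read_fact` and the helper itself are hypothetical.

```python
from datetime import date, timedelta

TABLE = "meter_read_fact"   # hypothetical table name
HOT_DAYS = 90               # updates touch only the last 90 days (from the question)


def read_only_statements(partition_days, today):
    """Render one ALTER TABLE per daily partition that is past the update window."""
    cutoff = today - timedelta(days=HOT_DAYS)
    return [
        f"ALTER TABLE {TABLE} MODIFY PARTITION FOR (DATE '{d.isoformat()}') READ ONLY"
        for d in sorted(partition_days)
        if d < cutoff
    ]


today = date(2018, 10, 11)
days = [today - timedelta(days=n) for n in (1, 89, 90, 91, 400)]
for stmt in read_only_statements(days, today):
    print(stmt + ";")
```

Only the partitions older than the cutoff are touched, so the recent 90 days stay writable in the same table.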

3. What would be best index strategy on such a huge table? We will need a primary key as well as additional indexes for quick retrieval for at least 60 days of data.

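In general, the fewer indexes the better if you want to load data quickly. You have the date in the primary key, so you can probably make it a local index, which helps with partition maintenance operations. You might consider subpartitioning by meter number, which *might* remove the need for an explicit index on that column.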
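A back-of-the-envelope calculation shows why that is a "might". The sketch below estimates how many rows a 60-day query for a single meter would read with and without hash subpartitioning. The subpartition count of 64 is an assumption chosen for illustration, and the estimate assumes rows are spread evenly across days and subpartitions.

```python
ROWS_PER_DAY = 128_000_000_000 // (7 * 365)   # average from the question's figures
QUERY_DAYS = 60                                # retrieval window from the question
HASH_SUBPARTITIONS = 64                        # assumption, chosen for illustration


def rows_scanned(days, subpartitions):
    """Rows read for one meter when pruning keeps one subpartition per day."""
    return days * (ROWS_PER_DAY // subpartitions)


without_subparts = rows_scanned(QUERY_DAYS, 1)
with_subparts = rows_scanned(QUERY_DAYS, HASH_SUBPARTITIONS)
print(f"{without_subparts:,} rows vs {with_subparts:,} rows")
```

Pruning cuts the scan by roughly the number of subpartitions, but each remaining subpartition still holds many meters, so whether that is fast enough without an index depends on the query workload.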

4. We have a very short window of time to load the above 128 billion rows. Data comes from 3 different Oracle databases (5 years data need unpivot function and remaining 2 years use some transformation and processing). We are currently using Informatica. What would be best strategy to load data?

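Exchange partition is probably your friend here, because it means you can

- load the data into an empty, unindexed table (very fast)
- then index it
- then exchange it into the real fact table.

That is far more efficient than loading into a table that is already indexed.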
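The three steps could be scripted along the following lines. This is a minimal sketch that only renders the SQL text for one day's load. The table, source, and column names are hypothetical (the columns follow the primary key listed in the question). It assumes the fact table is interval partitioned by day with no subpartitions, and it leaves out constraint handling, which a real exchange would need to match on both tables.

```python
from datetime import date, timedelta


def exchange_load_steps(fact, stage, source, day):
    """Render the SQL for loading one day of data via partition exchange."""
    d0 = f"DATE '{day.isoformat()}'"
    d1 = f"DATE '{(day + timedelta(days=1)).isoformat()}'"
    return [
        # Empty staging table with the fact table's columns and no indexes.
        f"CREATE TABLE {stage} AS SELECT * FROM {fact} WHERE 1 = 0",
        # Direct-path load into the unindexed staging table.
        f"INSERT /*+ APPEND */ INTO {stage} SELECT * FROM {source} "
        f"WHERE read_date >= {d0} AND read_date < {d1}",
        "COMMIT",
        # Index after the load; it should mirror the fact table's local index.
        f"CREATE UNIQUE INDEX {stage}_pk ON {stage} "
        "(read_date, txn_id, interval_id, meter_no)",
        # Locking the partition makes sure the interval partition for this day exists.
        f"LOCK TABLE {fact} PARTITION FOR ({d0}) IN SHARE MODE",
        # Swap the loaded table in as that day's partition.
        f"ALTER TABLE {fact} EXCHANGE PARTITION FOR ({d0}) "
        f"WITH TABLE {stage} INCLUDING INDEXES WITHOUT VALIDATION",
    ]


for stmt in exchange_load_steps("meter_read_fact", "stage_20181010",
                                "src_reads_v", date(2018, 10, 10)):
    print(stmt + ";")
```

Because each day uses its own staging table, several days can in principle be loaded and indexed in parallel before being exchanged in one by one.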
