Do it right - Very big table - fast queries

Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

All Forums

General SQL Server Forums

New to SQL Server Administration

Do it right - Very big table - fast queries

Author

Topic

Beer
Starting Member

7 Posts

Posted - 2011-08-16 : 09:01:38

Hello!

I am creating a table for storing measurements.
Table has 4 fields:
- ID
- Timestamp (not unique !!!)
- MeasurementID
- Value

I expect this table to become very large (30G+).
I am capturing data at max speed 20 packets/s (to buffer).
(I am writing captured data to DB on 1 second interval).

I will mostly run querys like this:
- Retrieve all data from X to Y for MeasurementID = Z
- Retrieve all data from X to Y for MeasurementID in [Z1, Z2, Z3,...]
X,Y = DateTime

How do I set indexes right?
I read about partitioning the table, but not sure if this is the right track?

Please help, advise,...

Thanks,
[Beer]

AndrewMurphy
Master Smack Fu Yak Hacker

2916 Posts

Posted - 2011-08-16 : 09:35:45

index on MeasurementID, Timestamp

SQL timestamp field <> a date/time field. it's a random hash code.
you may wish to use a formal date column - which implicitly includes a time portion.

if you wish to use partitions, then partitioning on MeasurementID is valid, but the normal way would be to partition on dates.

Beer
Starting Member

7 Posts

Posted - 2011-08-16 : 12:48:05

quote:
Originally posted by AndrewMurphy

index on MeasurementID, Timestamp

SQL timestamp field <> a date/time field. it's a random hash code.
you may wish to use a formal date column - which implicitly includes a time portion.

if you wish to use partitions, then partitioning on MeasurementID is valid, but the normal way would be to partition on dates.

Thanks Andrew.
I know about the timestamp (Type), but this is my column name (maybe not the best choice)

Would partitioning speed up queries?

I've created another index and set the columns as you suggested.
Here is what I have so far:

CREATE TABLE [dbo].[MeasurementTypes](
	[Id] [int] IDENTITY(1,1) NOT NULL,
	[Name] [nvarchar](50) NULL,
CONSTRAINT [PK_MeasurementTypes] PRIMARY KEY CLUSTERED 
(
	[Id] ASC
) WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON) ON [PRIMARY]
) ON [PRIMARY]

CREATE TABLE [dbo].[Measurements](
	[ID] [int] IDENTITY(1,1) NOT NULL,
	[Timestamp] [datetime] NULL,
	[MeasurementID] [int] NULL,
	[Value] [real] NULL,
CONSTRAINT [PK_Measurements] PRIMARY KEY CLUSTERED 
(
	[ID] ASC
) WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON) ON [PRIMARY]
) ON [PRIMARY]


CREATE NONCLUSTERED INDEX [IX_Measurements] ON [dbo].[Measurements] (
	[MeasurementID] ASC,
	[Timestamp] ASC
) WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON) ON [PRIMARY]

ALTER TABLE [dbo].[Measurements]  WITH CHECK ADD  CONSTRAINT [FK_Measurements_MeasurementTypes] FOREIGN KEY([MeasurementID])
REFERENCES [dbo].[MeasurementTypes] ([Id])

ALTER TABLE [dbo].[Measurements] CHECK CONSTRAINT [FK_Measurements_MeasurementTypes]

Kristen
Test

22859 Posts

Posted - 2011-08-17 : 19:48:13

"Would partitioning speed up queries?"

Yes, if your query only "covers" one/several of the total partitions available - provided that SQL can deduce that from the Query - i.e. the where clause has a clearly deducible range on the column(s) included in the partition check constraint.

Assuming that there will be "relatively few" rows in MeasurementTypes (i.e. Measurements will have lots of rows for the same values of MeasurementID) then your index "MeasurementID, Timestamp" may not be a good candidate as the first column will not be highly selective. Depends on the nature of your queries though. Like most things it would be best best to perform some timed performance tests on a reasonable qty of data and see a) how it performs and b) what query plan SQL actually uses

Recommend changing the name of your "timestamp" column to something that doesn't collide with a SQL reserve word

Lumbago
Norsk Yak Master

3271 Posts

Posted - 2011-08-18 : 07:33:52

Is the Measurments.MeasurementID in fact the same as MeasurementTypes.ID? In that case I'd name the MeasurementID as MeasurementTypeID instead?.

How many of these will you have? If MeasurementID are in the 10's or low 100's then partitioning on this column *could* be viable, but the better (and more likely) partitioning candidate is this timestamp-column. Partition based on hour, day, month, year...it all depends on your queries and inserts really.

- Lumbago
My blog-> http://thefirstsql.com/2011/07/08/how-to-find-gaps-in-identity-columns-at-the-speed-of-light/

Beer
Starting Member

7 Posts

Posted - 2011-08-19 : 01:48:16

Hello

Yes, you are both right.
- MeasurementTypes will contain few lines (10,20,... < 100)
(For example: Temp_1, Pressure_1, Velocity_1, Temp_2,...)
Basically. When new sensor is added to the system, I add new measurement type.

The whole idea is to design a database, so I can dynamically add new measurement.

I would appreciate if someone modifies my "source" and copy/paste it here.

Btw: Timestamp column name will be renamed (I can't remember better translation from Slovene to english).

Yes: MeasurementID column should be renamed to MeasurementTypeID
Thanks

Subscribe to SQLTeam.com

SQLTeam.com Articles via RSS

SQLTeam.com Weblog via RSS

- Advertisement -

Resources