Learn Teradata: Interview Questions

First SET Questions:
Question: Write a query to remove duplicates from a table?

Question: Why Fload doesn’t support multiset table?

Question: What is the process to restart the multiload in application phase if it fails?

Question: What is the process to restart the multiload in acquisation phase if it fails?

Question: Can we load a Multi set table using MLOAD?

Question: can I use “drop” statement in the utility “fload”?

Question: What are the types of tables in Teradata

Question: can we have an unconnected lkp to lookup a DB2 record against a Teradata record?

Question: What is the diffrence between Multiload & Fastload interms of Performance?

Question: How does indexing improve query performance?

Question: How can we build data marts from FSLDM. is there any standard approach for this?Is FSLDM supports place ?

Question: Does SDLC changes when you use Teradata instead of Oracle?

Question: What is a common data source for the central enterprise data warehouse?

Question: What are two examples of an OLTP environment?

Question: In which phase of the Active Data Warehouse evolution do you use data to determine what will happen?

Question: How many error tables are there in fload and what are their significance/use?

Question: Can we see the data of error tables?

Question: How many error tables are their in mload and what is there use?

Question: When mload job fails, can we access mload tables? If yes then how?

Question: After creating tables dynamically in the Teardata, where is the GRANT table option usually done ? When tables are newly created, what is the default role and what the default privileges which get assigned ?

Question: what is explain in teradata?

Question: How many codd's rules are satisfied by teradata database?

Question: What is cliques? What is Vdisk and how it will communicate with physical data storage at the time of data retrieval through AMP ?

Question: In which phase of the Active Data Warehouse evolution do you use data to determine what will happen?

Question: Which two can be implemented as a centrally located logical architecture? è operational data store logical data mart

Question: What is use of compress in terdata?Explain?

Question: What is FILLER command in Teradata?

Question: How a Referential integrity is handled in Teradata?

Question: Why AMP & PE are called Vprocs?

========================

SecondSET Questions:

Q. characteristics of PP2

Q. characteristics of clique

Q. what locks are acquired for an update statement

Q. order of processing outer join (matched rows, unmatched-null, where clause)

Q. Why to use macros

Q. characteristics of 3rd normal form.

Q. which is inner and outer table in a 3 table join.

Q. which stage of db design do we take advantage of td parallelism (PDM).

Q. what is incorporated in ELDM.

Q. what type of joins can we write in queries (inner, outer, cross)

Q. redistribution in merge join.

Q. use of statistics wizard, index wizard

Q. what sql statements are valid in a trigger.

Q. characteristics of Tpump, multiload.

Q. what is partial covering index join

Q. what is rowid join (special form of nested join)

Q. what is bsms nusi bitmapping?

Q. what 2 utilities provide record base report exporting

=====================

Third SET Questions:

• Steps to improve performance of the query?
• Explain Primary Index and what would be the constrains to select PI?
• What is skew factor?
• What is difference between Role, Priveledge and profile?
• What are different spaces in Teradata and difference?
• If your Skew factor is going up. What are remedies?
• When, How and why we use Secondry Indexes?
• What is difference between Primary Key and Primary Index?
• What is difference between database and user in Teradata. what are the things you can do or can not do in both?
• What is Checkpoint?
• When do you use BTEQ. What other softwares have you used or can we use rather than BTEQ?
• How many type of files have you loaded and their differences. (Fixed and Variable)?
• How do you execute your jobs in Teradata Environment?
• What was the environment of your latest project (Number of Amps, Nodes, Teradata Server Number etc)?
• What is difference between Multiload, FastLoad and TPUMP?
• What is the process to restart the multiload if it fails?
• How does indexing improve query performance?
• what are the different functions you do in BTEQ (Errorcode, ErrorLevel, etc)?
• What are different types of joins?
• What does COALESCE do in Teradata?
• What is difference between ZEROIFNULL and NULLIFZERO?
• What is Range_N?
• Explain PPI
• What is Casting in Teradata?
• What is difference between UNION and MINUS?
• What are the types of tables in Teradata?
• Can we have an unconnected lkp to lookup a DB2 record against a Teradata record?
• What is the diffrence between Multiload & Fastload interms of Performance?
• How does indexing improve query performance?
• How can we build data marts from FSLDM. is there any standard approach for this?Is FSLDM supports place ?
• Does SDLC changes when you use Teradata instead of Oracle?
• What is a common data source for the central enterprise data warehouse?
• What are two examples of an OLTP environment?
• In which phase of the Active Data Warehouse evolution do you use data to determine what will happen?
• what is the difference between start schma and Fsldm?

Fourth SET Questions:

===================

1. Explain about Project life cycle?

2. Diff between row number and rank?

3. how can you do performance tunning?

4. what is the extra commands you use in perfomence tunning?

5. Can we do performance tunning columns wise?then what is the syntax?

6. which architecture component use for performance tunning?

7. what are steps you followed after using expain command?

8. which join strategy is best and why?

9. in which situation we use Fexport?

10 .what is the command we use to get limited number for records in Fexport?

11. diff bet volatile and global temp table?

12. you have a table with Eid,Emax and you want to display the max sal with followed by Eid?

13. what is the syntax for single join index?

13. what is the query for creating table to new table with data?

14. what is clique?

15. what is materialized view?

16. Table contains million num of records you want to changes the M records to N num of records what is the query?

17. what is the query for displaying the full name which contains two columns like first name and last name?

13. Diff between distinct and count?

14. Explain the data distribution when we apply the index into the table?

15. A table contain the PI for one column do Multiload load the duplicate records load the data?

16. A SET table contain the PI for one column do Multiload load the duplicate records load the data?

17. A table contains millions of records having different company employees what is the query to display all names which contains only ibm employees in this paternname@company.com

18. A table contain the nulls replaced into user define names?

20. what is disadvantages of multiload?

21. If you got an error while load the data in Bteq how can you skip and continue the script?

22. what is Work table?

23. in ET what are the data displays?

24. what is SCD?Explain the Types?

25. what is confirm table when we use it?

26. what is the Fact on factless fact table?

27. Do you have a knowledge in datamodling?

28. have you worked on designing?

29. what is hash index?

30. what is diff between read and readlock and access lock?

===========================================

First SET Answers :

Question: Write a query to remove duplicates from a table?
Method : 1
Create table tmp as
( select * from table
qualify row_number() over (partition by pk_table order by pk_table) = 1
) with data;
Delete from table all;
Insert into table select * from tmp;
Drop table tmp;
------------
Method : 2
If we are having huge data in a file then simply load in temp table by using Fastload(only file to table) where it wont allow duplicates.
------------
Method : 3
Load in a SET table and rename it into main table
-----------
Method : 4
Delete DML wont allow order by function in select query and also if you sub query it could be chance of deleting entire rows, so we can just read the duplicate record with the below query for deleting we need to follow above three methods.

sel * from table a
qualify row_Number() over (partition by col1 col2 col3....coln order by col1 col2 col3....coln) > 1

Question: What is FILLER command in Teradata?
Answer
# 1 while using the mload of fastload if you don;t want to load
a particular filed in the datafile to the target then use
this filler command to achieve this

Question: How a Referential integrity is handled in Teradata?
Answer
# 1 TPUMP

Question: Explain about Skew Factor?
Answer
# 2 The data distribution of table among AMPs is called Skew Factor

Generally For Non-Unique PI we get duplicate values so the
more duplicate vales we get more the data have same rowhash
so all the same data will come to same amp, it makes data
distribution inequality,

One amp will store more data and other amp stores less
amount of data, when we are accessing full table, The amp
which is having more data will take longer time and makes
other amps waiting which leads processing wastage

In this situation (unequal distribution of data)we get Skew
Factor High

For this type of tables we should avoid full table scans

ex:
AMP0    AMP1
10000(10%)   9000000(90%)

in this situation skew factor is very high 90%

Question: Why AMP & PE are called Vprocs?
Answer
# 1 Vprocs:Virtual process
From PE to AMP (This is the network root via MSP(messege
passing layer),The processing data will store in Disks
(These are Physical disks),Each Amp have too many P.disks,
to migrate these P.disks The Bynet Network maintains
Virtual disks.Thses V.disks will responsible for data
migration.
That y we called this process is Virtual Process(VPROCS).

I think it will help u,,,,,

Question: Why Fload doesn’t support multiset table?
Answer
# 1 Fast Load will load the Data Fastly,
   Mean Concurrently,Bulk Insert.
   Internally TeraData server will Create
   Bulk insert DML statament
   (Like: insert into Table1 as sel * from Table2
           with data)
    For Fast Performence it will not support the
     duplicate values,Refential Intigrity,,,,

Question: What is use of compress in terdata?Explain?
Answer
# 1 Compression is used to Minimize the table size, for example
when the table size is increasing anonymously We can use
Compression to reduce the size of the table

Conditions:
1.Compression can be declared at the time of table creation
2.We can compress up to 256 column values(not columns)
3.We can't compress variable length fields
(vartext,varchar..)

Explanation for 2 condition:

create table tab1
(
:
:
Order_type char(25) compress ('air','sea','road')
:
)
in the above example order type have 3 fields, one should
be selected by the user, so one of the field will repeat
for every order, like these column values we can use
compress statement because these are repeating for entire
table, like these column values TD supports 256 col

generally NCR people will provides ready made scripts for
these type of compressions


   Question: What is use of compress in terdata?Explain?
Answer
# 2 Part of the answer by Sat!sh is correct.

However, we can store only one value per column and not 3
(air, sea, road). The compressed value is stored in column
header and is to be used as default of that column unless a
value is present.

e.g.
Dept of Taxes in Washington has a database that tracks all
people working in Washington. Around 99.9% of the tax payers
would have Washington as their state code on their address.
Instead of storing “Washington” in millions of records the
compress will store the value “Washington” in the table
header. Now, a value can be assumed in a row as a default
unless another value exists inside the column.

Question: What is the process to restart the multiload if it fails?
Answer
# 1 MULTILOAD will creates 2 error tables, 1 work table

When MULTILOAD fails We have to unlock the Main Table, here
error tables and work tables are not locked like FLOAD

To Unlock the Main Table in Mload

RELEASE MLOAD <TABLE NAME>;


   Question: What is the process to restart the multiload if it fails?
Answer
# 2 The above ans is only when the MLOAD failed in acquisation phase

To release lock in application phase failure in MLOAD

RELEASE MLOAD <Table Name> .IN APPLY;

Question: Can we load a Multi set table using MLOAD?
Answer
# 1 YES,

We can Load SET, MULTISET tables using Mload, But here when
loading into MULTISET table using MLOAD duplicate rows will
not be rejected, we have to take care of them before
loading.

But incase of Fload when we are loading into MULTISET
duplicate rows are automatically rejected, FLOAD will not
load duplicate rows weather table is SET or MULTISET

Question: can I use “drop” statement in the utility “fload”?
Answer
# 1 YES,

But you have to declare it out of the FLOAD Block it means
it should not come between .begin loading,.end loading

FLOAD also supports DELETE,CREATE,DROP statements which we
have to declare out of FLOAD block

in the FLOAD Block we can give only INSERT

Question: IS IT POSSIBLE THAT THERE R TWO PRIMARY KEY WILL BE IN A SAME TABLE?PLZ HELP ME BY GIVING THE ANSWER.THANK U.
Answer
# 1 Absolutly No. One table can not have more than one PK. But
u can have 2 cols as 1 PK called Composite Primary Key.

Question: IS IT POSSIBLE THAT THERE R TWO PRIMARY KEY WILL BE IN A SAME TABLE?PLZ HELP ME BY GIVING THE ANSWER.THANK U.
Answer
# 3 Primary key
============
1. A table should have only one primary key
2. More than one column can consist of a primary key – upto
64 columns
3. Can not be NULL values (missing values)
4. Should be unique values (no duplicate value)

Foreign key
============
1. A table may have zero or more than that upto 32 keys
2. More than one column can consist a primary key – upto 64
columns
3. Can have NULL values
4. Can have duplicate values
5. There should be a corresponding primary key in the
parent table to enforce referential integrity for a foreign
key

Question: IN A TABLE CAN WE USE PRIMARY KEY IN ONE COLUMN AND IN ANOTHER COLUMN BOTH UNIQUE AND NOT NULL CONSTRAINS.IF YES HOW?PLZ GIVE ME A EXAMPLE.
Answer
# 2 Yes, you can have a column for Primary key and have another
column which will have no duplicates or null.

e.g.

A Salary Table will have employee ID as primary key. The
table also contains TAX-ID which can not be null or duplicate.

Question: What are the types of tables in Teradata
Answer
# 5 1-- SET TABLE
2-- MULTISET TABLE
3-- GLOBAL TEMPORARY TABLE
4-- VOLATILE TABLE
5-- DERIVED TABLE

Question: can we have an unconnected lkp to lookup a DB2 record against a Teradata record?
Answer
# 1 Yes,,
We can use like Like,(Diff Databases)

Question: What is the diffrence between Multiload & Fastload interms of Performance?
Answer
# 1 Fastload is used to load empty tables and is very fast, can
load one table at a time.

Multiload can load at max 5 tbls at a time and can also
update and delete the data.

Fastload can be used only for inserting data, not updating
and deleting.


   Question: What is the diffrence between Multiload & Fastload interms of Performance?
Answer
# 2 Multiload can at max 5 tables with non unique secondary
indexes on them. where as in fastload u cannot have
secondary indexes on the table.

In multiload you can insert,update or delete data in
already populated tables whereas in fastload the target
table should be empty.

Multiload works in 5 phases whereas fastload works in two
phases.

If we want to load data into an empty table then fastload
is best option as compared to multiload.


   Question: What is the diffrence between Multiload & Fastload interms of Performance?
Answer
# 3 multiset table will allow duplication eventhough if you
load it into fastload it wont allow duplication

Question: How does indexing improve query performance?
Answer
# 1 By creating index on single or multiple columns, query
takes index access path instead of full table scan. Full
table scan is costilier operation where it has to visit all
rows in the table whereas index operation just accesses the
index sub table with required hash value and joins with the
basetable using rowid. This improves response time and
performance of the query.


   Question: How does indexing improve query performance?
Answer
# 2 Teradata automatically Creates the Primary index , if not
specified by the DDl for Table .Some of the Other Types of
indexes are
Secondary Index
Partitioned Primary Index
Join Index

Primary Index :- the PI is the way , Teradata Distributes
the Data , Accesses the Data to/from Amps. By Using the
Primary Index with Value or Range of Values in the Where
Clause of the Query , the operation Becomes a Single AMP/
Few amp aperation instead of all amp operation . the Access
path is
Rowhash of the Pi is known --> Amp of the Column along
with rowid of the Column . (1 Amp Ops)

Results
1. Highly Optimised Query Retrival
2. less Cost per Retrival

Secondary Index / Join Indexes :-

Creates a Secondary Index / Joinindex / Hashindex
Subtable , which maps the Column/s of the SI and the PI
Hash /Join(Frequently used Columns in two tables in this
Case ) , The operations , if using these Indexes are
basically 2 or more amp operation .

Results
1. Quicker Query Retrival time than a full table scan .
2. costs less on the Resources than a FTS
3. the Performance is improved if the JI is used by the
Optimizer

cons --

1. Additional Space Req. for the Subtables
2. need to be droped and recreated for Loads to happen
(Multi Load , FastLoad)ect , which is a real PAIN !!!
3.still be a 2 amp operation when compared to PI Retrival
for a SI.

PPI :-

Partitioned Primary Indexes are Created so as to divide the
table onto partitions based on Range or Values as Required .

the data is first Hashed into Amps , then Stored in amps
based on the Partitions !!! which when Retrived for a
single partition / multiple Partitions , will be a all amps
Scan, but not a Full Table Scan !!!! . this is effective
for Larger Tables partitioned on the Date Specially !!!
there is no extra Overhead on the System (no Spl Tables
Created ect )

Question: How can we build data marts from FSLDM. is there any standard approach for this?Is FSLDM supports place ?
Answer
# 1 In teradata, we can create another database which will
contain all the views for a given subject area. These views
togeather will form a logical datamart for that set of
business users.


   Question: How can we build data marts from FSLDM. is there any standard approach for this?Is FSLDM supports place ?
Answer
# 2 Teradata is like all other DBMS, we can create as many
database as required. Data Mart are basically subject
oriented and mainly based on business KPIs which end-users
generally like to measure. You have to refer business
requirement document for designing data mart.


Question: Does SDLC changes when you use Teradata instead of Oracle?
Answer
# 1 SDLC doesn't change but physical data model changes.
Physical data model(i.e table structure) needs to be re-
designed according to the teradata database.

Question: What is a common data source for the central enterprise data warehouse?
Answer
# 1 operational data stores (ODS)

Question: What are two examples of an OLTP environment?
Answer
# 1 On Line Banking
On Line Reservation(Transportation like Rail, Air etc.)

Question: In which phase of the Active Data Warehouse evolution do you use data to determine what will happen?
Answer
# 1 In predicting phase.

Analysts utilize the system to leverage information to
predict what will happen next in the business to proactively
manage the organization's strategy.


Question: what is the difference between start schma and Fsldm?
Answer
# 1 A star schema has one Fact tables and many dimensional
tables to store the related.

FSLDM (Financial services logical data model) is a model
proposed and designed by NCR to cater the needs of financial
sectors, specially the Banking domain.

It has many Fact tabls and dimension tbls.

The major fact tbls include Party (Customers of the bank),
Product (Services offered by the bank), Event (An event
occurred ex: An ATM transaction), Agreement (A deal betweem
the party and the bank for a product)etc...

The FSLDM can be customized to cater to bank's specific needs.


   Question: what is the difference between start schma and Fsldm?
Answer
# 2 FSLDM -> 3rd normal form
Star Schema --> Denormalized for query performance


   Question: what is the difference between start schma and Fsldm?
Answer
# 3 FLSDM is a 3nf model, vanilla made to cater a Bank's EDW
need.

Star Schema is a denormalized model


Question: can we have an unconnected lkp to lookup a DB2 record against a Teradata record?
Answer
# 1 Hi,
we can use an unconnected lookup to lookup on any DB
because we specify the connection string details in the
lookup transformation !

Question:
How many error tables are there in fload and what are their significance/use?
Can we see the data of error tables?
How many error tables are their in mload and what is there use?
When mload job fails, can we access mload tables? If yes then how?

Answers:
Fload uses 2 error tables
Error table 1: where format of data is not correct.
Error table 2: violations of UPI

Mload also uses 2 error tables (ET and UV), 1 work table and 1 log table
1. ET TABLE - Data error
MultiLoad uses the ET table, also called the Acquisition Phase error table, to store data errors found during the acquisition phase of a MultiLoad import task.

2. UV TABLE - UPI violations
MultiLoad uses the UV table, also called the Application Phase error table, to store data errors found during the application phase of a MultiLoad import or delete task

3. WORK TABLE - WT
Mload loads the selected records in the work table

4. LOG TABLE
A log table maintains record of all checkpoints related to the load job, it is essential/madatory to specify a log table in mload job. This table will be useful in case you have a job abort or restart due to any reason.

Question:
After creating tables dynamically in the Teardata, where is the GRANT table option usually done ? When tables are newly created, what is the default role and what the default privileges which get assigned ?
Answer:
The GRANT option for any particular table depends on the privilages of the user. If it is an admin user you can grant privilages at any point of time.

The deafult roles associated with the newly created tables depend on he schema in which they are created.

Question :
What is the difference between Multiload & Fastload interms of Performance?
Answers:

If you want to load, empty table then you use the fastload, so it will very usefull than the mutiload..because fastload performs the loading of the data in 2phase..and it noneed a work table for loading the data.., so it is faster as well as it follows the below steps to load the data in the table

Phase1-It moves all the records to all the AMP first without any hashing
Phase2-After giving endloading command, Amp will hashes the record and send it to the appropriate AMPS .

Multiload:

It does the loading in the 5 phases
Phase1:It will get the import file and checks the script
Phase2:It reads the record from the base table and store in the work table
Phase3:In this Application phase it locks the table header
Phase4:In the DML opreation will done in the tables
Phase 5: In this table locks will be released and work tables will be dropped.

Question :
what is explain in teradata?
Answers:

The EXPLAIN facility is a teradata extension that provides you with an "ENGLISH" translation of the steps choosen by the optimizer to execute an SQL statement.It may be used oin any valid teradata database with a preface called "EXPLAIN".

The following is an example:-

EXPLAIN select last_name,first_name FROM employees;

The EXPLAIN parses the SQL statement but does not execute it.

This provides the designer with an "execution stratergy".

The execution stratergy provides what an optimizer does but not why it choses them.

The EXPLAIN facility is used to analyze all joins and complex queries.

Question :
what is an optimization and performance tuning and how does it really work in practical projects. can i get any example to better understand.
Answers:

Performance tuning and optimization of a query involves collecting statistics on join columns, avoiding cross product join, selection of appropriate primary index (to avoid skewness in storage) and using secondary index.

Avoiding NUSI is advisable.

Question :
How many codd's rules are satisfied by teradata database?
Answers:

There are 12 codd's rules applied to the teradata database

Question :
What is cliques? What is Vdisk and how it will communicate with physical data storage at the time of data retrieval through AMP ?
Answers:

A clique is a set of Teradata nodes that share a common set of disk arrays. Cabling a subset of nodes to the same disk arrays creates a clique.

Each AMP vproc must have access to an array controller, which in turn accesses the physical disks. AMP vprocs are associated with one or more ranks
(or mirrored pairs) of data. The total disk space associated with an AMP is called a vdisk. A vdisk may have up to three ranks. Hence Vdisk will communicate with physical storage through array controllers.

Question :
In which phase of the Active Data Warehouse evolution do you use data to determine what will happen?
Answers:

It is stage3 - Predicting: Sophisticated analysts heavily utilize the system to leverage information to predict what will happen next in the business to proactively manage the organizationâ€™s strategy. This stage requires data mining tools and building predictive models using historical detail. As an example, users can model customer demographics for target marketing.

Question :
What are two examples of an OLTP environment?
Answers:

The two expamples of OLTP aQuestion:

1- ATM.

2- POS.

Question :
what is basic teradata query language?
Answers:

BTEQ(Basic teradata query)

It allows us to write SQL statements along with BTEQ commands. We can use BTEQ for importing,exporting and reporting purposes.

The commands start with a (.) dot and can be terminated by using (;), it is not mandatory to use (;). SQL statements doesnt start with a dot , but (;) is compulsary to terminate the SQL statement.

BTEQ will assume any thing written with out a dot as a sql statement and requires a (;) to terminate it.

How does indexing improve query performance?   View Answer
Answers:

Indexing is a way to physically reorganise the records to enable some frequently used queries to run faster.

The index can be used as a pointer to the large table. It helps to locate the required row quickly and then return ot back to the user.

or

The frequesntly used queries need not hit a large table for data. they can get what they want from the index itself. - cover queries.

Index comes with the overhead of maintanance. Teradata maintains its index by itself. Each time an insert/update/delete is done on the table the indexes will also need to be updated and maintained.

Indexes cannot be accessed directly by users. Only the optimizer has access to the index.

Does SDLC changes when you use Teradata instead of Oracle?   View Answer
Answers:

If you are going to use teradata instead of oracle

If the teradata is going to be only a data base means It won’t change the System development life cycle (SDLC)

If you are going to use the teradata utilities then it will change the Architecture or SDLC

If your schema is going to be in 3NF then there won’t be huge in change

--------------------------------------------------------------------------------

Which two can be implemented as a centrally located logical architecture? è operational data store è logical data mart   View Answer
Answers:

1- Logical Data Mart Repository

2- Provides a single version of the truth

Can we have an unconnected lkp to lookup a DB2 record against a Teradata record? doesnt seem to work. I could be wrong.   View Answer
Answers:

Yes we can ..but some times we need to check the value which lookup is returning .....like it happens in teradata if we are having lookup on varchar column and record is coming from source system it never works because when informatica creates cache from there it adds space to the values.so we always have to use ltrim(rtrim)) function in lookup query.

Which two statements are true about a foreign key? Each Foreign Key must exist as a Primary Key. Foreign Keys can change values over time.   View Answer
Answers:

first : True
second : False

What is the difference between start schma and Fsldm?   View Answer
Answers:

FSLDM --> Financial Services Logical Data Model (Teradata FS-LDM 7.0) its developed by Teradata for the financial sectors (Speciafially for Bank). it can be customised based on the user requirement.

StarSchma --> Its a relational database schema for representing multidimensional data. The data is stored in a central fact table, with one or more tables holding information on each dimension.

What is a common data source for the central enterprise data warehouse?   View Answer
Answers:

ODS=>Operational Data Source

Can we have an unconnected lkp to lookup a DB2 record against a Teradata record? doesnt seem to work. I could be wrong.
Answers:

Yes we can ..but some times we need to check the value which lookup is returning .....like it happens in teradata if we are having lookup on varchar column and record is coming from source system it never works because when informatica creates cache from there it adds space to the values.so we always have to use ltrim(rtrim)) function in lookup query.

Set and Multi Set Tables
What are set tables and multiset tables in Teradata?Explain with an appropriate example?
1)      Set tables cannot accept duplicate at row level not Index or key level.

Exmaple of rows for set table:

R1 c2 c3 c4 .. cn
1    2   3 4 ... 9      Accepted
1    2   3 4 ... 9      Duplicate is Rejected
2    1   2 4 ... 9
3    2   4 4 ... 9
4    3   4 4 ... 9

2)      Musti set Tables can accept duplicate at row level not Index or key level.

Exmaple of rows for multi set table:

R1 c2 c3 c4 .. cn
1    2   3 4 ... 9
1    2   3 4 ... 9       Duplicate is Accepted 2    1   2 4 ... 9
3    2   4 4 ... 9         3    2   4 4 ... 9       Duplicate is Accepted

Join Stratagies
There are 2 tables, table A with 10 million records, table B has 100 million records, now we are joining both tables, when we seen Explain Plan the plan showing TD will took the table A and it will redistributes it
Now the Question is: By that plan is the optimizer is correct job or not ? Justify Ans
2. From the same above example now the optimizer is taking Table B (100 million records) and it is distributing it,
Now is the optimizer is doing best? and How you avoid this situation

Answer:
Teradata is smart enough to decide when to redistribute and when to copy....
It compares the tables. Are they comparable? or one is big as compared to the other?
Based on simple logic it decides whether to distribute the smaller table on all the AMPs or to copy....what I mean is the small table is copied into all the AMPs in the SPOOL space...Remember all always the JOINs has to take place on the AMPs SPOOL Space...By redistributing it is making sure that the 100 million rows table gets the feeling that it is making AMP local JOIN...

Remember the basic thing what ever Teradata does...it does keeping in consideration for Space and Performance and not to forget the Effiiciency...

My simple formula: If the table is small redistribute them to all the AMPs to have the AMP local Join. Always JOINs are made AMP local if it cannot then you have the high chance of running out of SPOOL space.

Fload, Mload and error tables
How many error tables are there in fload and what are their significance/use?
Can we see the data of error tables?
How many error tables are their in mload and what is there use?
When mload job fails, can we access mload tables? If yes then how?

Answer:
Fload uses 2 error tables

Error table 1: where format of data is not correct.

Error table 2: violations of UPI

Mload also uses 2 error tables (ET and UV), 1 work table and 1 log table

1. ET TABLE - Data error

MultiLoad uses the ET table, also called the Acquisition Phase error table, to store data errors found during the acquisition phase of a MultiLoad import task.

2. UV TABLE - UPI violations

MultiLoad uses the UV table, also called the Application Phase error table, to store data errors found during the application phase of a MultiLoad import or delete task

3. WORK TABLE - WT

Mload loads the selected records in the work table

4. LOG TABLE

A log table maintains record of all checkpoints related to the load job, it is essential/madatory to specify a log table in mload job. This table will be useful in case you have a job abort or restart due to any reason.

Remaining Answers will be adding soon

Learn Teradata

Social Icons

Pages

Popular Posts

Followers

About Me

Coming Soon!

Interview Questions

2 comments:

Blog Archive

Social Icons