Map database schema in Power BI
I've come across a video on YouTube that describes How to Easily Map Your Database Schema in Power BI using the AdventureWorks database from Microsoft. Now I'm trying to replicate that example using another database. The problem is that many of my columns have similar content but different column names, with prefixes such as pk_ or fk_ depending on which table they are in. As a result, the following query fails to match them:
    SELECT
        c.TABLE_NAME
       ,c.COLUMN_NAME
    FROM INFORMATION_SCHEMA.COLUMNS c
    INNER JOIN
        (SELECT
             COLUMN_NAME
         FROM INFORMATION_SCHEMA.COLUMNS
         GROUP BY COLUMN_NAME
         HAVING COUNT(*) > 1
        ) dupes
        ON dupes.COLUMN_NAME = c.COLUMN_NAME
Does anyone know if it's possible to fuzzy match column names, or to take the different prefixes into account, to make this work? The very same question has been asked directly to the YouTube OP. It can also be found on reddit.com, but it remains unanswered.
I'm trying to wrap my head around some more advanced Power BI features and at the same time learn some much-needed SQL, and I thought this would be a cool place to start, so any help is much appreciated!
sql sql-server tsql powerbi
edited Nov 7 at 21:35
Lukasz Szozda
asked Nov 5 at 13:44
vestland
1 Answer (accepted)
If you want to show relationships between tables, then matching on column names shared between tables is not the best idea, because two columns can have the same name without being related.
For example:
    CREATE TABLE tab(id INT PRIMARY KEY, name INT);
    CREATE TABLE tab2(id2 INT PRIMARY KEY, name INT);
    -- completely unrelated tables

    SELECT
        c.TABLE_NAME
       ,c.COLUMN_NAME
    FROM INFORMATION_SCHEMA.COLUMNS c
    INNER JOIN
        (SELECT
             COLUMN_NAME
         FROM INFORMATION_SCHEMA.COLUMNS
         GROUP BY COLUMN_NAME
         HAVING COUNT(*) > 1
        ) dupes
        ON dupes.COLUMN_NAME = c.COLUMN_NAME
    +-------------+-------------+
    | TABLE_NAME  | COLUMN_NAME |
    +-------------+-------------+
    | tab         | name        |
    | tab2        | name        |
    +-------------+-------------+
db<>fiddle demo
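If name matching is still wanted despite that caveat (for instance because no foreign keys are declared), the prefix idea from the question could be sketched roughly as follows. This is only a sketch under assumptions: it assumes the only prefixes are pk_ and fk_, each exactly three characters long, and BASE_NAME is a hypothetical alias introduced here for the column name with the prefix removed. Whether the LIKE comparison is case-sensitive depends on the database collation.

```sql
-- Sketch: compare column names after stripping an assumed pk_ / fk_ prefix.
WITH normalized AS (
    SELECT
        c.TABLE_NAME
       ,c.COLUMN_NAME
        -- [_] matches a literal underscore; a bare _ is a LIKE wildcard
       ,CASE
            WHEN c.COLUMN_NAME LIKE 'pk[_]%'
              OR c.COLUMN_NAME LIKE 'fk[_]%'
            THEN STUFF(c.COLUMN_NAME, 1, 3, '')  -- drop the first 3 characters
            ELSE c.COLUMN_NAME
        END AS BASE_NAME                         -- hypothetical alias
    FROM INFORMATION_SCHEMA.COLUMNS c
)
SELECT
    n.TABLE_NAME
   ,n.COLUMN_NAME
   ,n.BASE_NAME
FROM normalized n
INNER JOIN
    (SELECT
         BASE_NAME
     FROM normalized
     GROUP BY BASE_NAME
     -- keep only base names that occur in more than one table
     HAVING COUNT(DISTINCT TABLE_NAME) > 1
    ) dupes
    ON dupes.BASE_NAME = n.BASE_NAME
ORDER BY n.BASE_NAME, n.TABLE_NAME;
```

The same limitation applies as in the example above: a shared base name only suggests a relationship, it does not prove one.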
I would suggest using the proper metadata views instead, i.e. sys.foreign_key_columns:
    SELECT [table]             = tab1.name,
           [column]            = col1.name,
           [referenced_table]  = tab2.name,
           [referenced_column] = col2.name
    FROM sys.foreign_key_columns fkc
    JOIN sys.objects obj  ON obj.object_id = fkc.constraint_object_id
    JOIN sys.tables tab1  ON tab1.object_id = fkc.parent_object_id
    JOIN sys.schemas sch  ON tab1.schema_id = sch.schema_id
    JOIN sys.columns col1 ON col1.column_id = fkc.parent_column_id
                         AND col1.object_id = tab1.object_id
    JOIN sys.tables tab2  ON tab2.object_id = fkc.referenced_object_id
    JOIN sys.columns col2 ON col2.column_id = fkc.referenced_column_id
                         AND col2.object_id = tab2.object_id;
db<>fiddle demo2
Then you need to choose an appropriate visualisation method in Power BI.
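Many relationship or network style visuals expect one row per link with a source and a target field, so one possible shape for the result is an edge list. The following is a minimal sketch, not a definitive query: the output column names (source_table, target_table and so on) are hypothetical and should be adapted to whatever the chosen visual expects. It uses the built-in metadata functions OBJECT_SCHEMA_NAME, OBJECT_NAME and COL_NAME in place of explicit joins.

```sql
-- Sketch: one row per foreign key column pair, with schema-qualified names.
SELECT
    [source_table]    = OBJECT_SCHEMA_NAME(fkc.parent_object_id) + '.'
                      + OBJECT_NAME(fkc.parent_object_id)
   ,[source_column]   = COL_NAME(fkc.parent_object_id, fkc.parent_column_id)
   ,[target_table]    = OBJECT_SCHEMA_NAME(fkc.referenced_object_id) + '.'
                      + OBJECT_NAME(fkc.referenced_object_id)
   ,[target_column]   = COL_NAME(fkc.referenced_object_id, fkc.referenced_column_id)
   ,[constraint_name] = OBJECT_NAME(fkc.constraint_object_id)
FROM sys.foreign_key_columns fkc
ORDER BY [source_table], [target_table], fkc.constraint_column_id;
```

Schema-qualified names keep tables with the same name in different schemas apart, and the constraint name makes it possible to group the rows of a multi-column foreign key.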
Thank you for answering! Is your suggestion limited to a certain number of tables? Or will the query somehow map all tables and references from INFORMATION_SCHEMA.COLUMNS, or something like that?
– vestland
Nov 12 at 8:02
@vestland It will return all tables that have FK relationships in the particular database.
– Lukasz Szozda
Nov 12 at 16:18
edited Nov 7 at 21:38
answered Nov 7 at 21:31
Lukasz Szozda