Home » Extract and Group

Extract and Group

Extract the Departments and their Sub-department Department name is before the 1st hyphen Solution should be in a Cross-tab as shown Dynamic array function allowed but Extra marks for Legacy solutions or PowerQuery Solution

📌 Challenge Details and Links
Challenge Number: 70
Challenge Difficulty: ⭐
📥Download Sample File
📥Link to the solutions on LinkedIn

Solving the challenge of Extract and Group with Power Query

Power Query solution 1 for Extract and Group, proposed by Kris Jaganah:
let
  A = Excel.CurrentWorkbook(){[Name = "SubDepts"]}[Content], 
  B = Table.AddColumn(A, "Ans", each Text.BeforeDelimiter([#"SUB-DEPARTMENT NAMES"], "-")), 
  C = Table.Group(B, "Ans", {"All", each [#"SUB-DEPARTMENT NAMES"]}), 
  D = Table.FromColumns(C[All], C[Ans])
in
  D
Power Query solution 2 for Extract and Group, proposed by Brian Julius:
let
  Source = Excel.CurrentWorkbook(){[Name = "SubDepts"]}[Content], 
  AddDept = Table.AddColumn(
    Source, 
    "Dept", 
    each Text.BeforeDelimiter([#"SUB-DEPARTMENT NAMES"], "-")
  ), 
  Group = Table.Group(AddDept, {"Dept"}, {{"All", each [#"SUB-DEPARTMENT NAMES"]}}), 
  FromCols = Table.FromColumns(Group[All], Group[Dept])
in
  FromCols
Power Query solution 3 for Extract and Group, proposed by Abdallah Ally:
let
  Source = Excel.CurrentWorkbook(){[Name = "SubDepts"]}[Content], 
  Result = [
    a = Source[#"SUB-DEPARTMENT NAMES"], 
    b = List.Distinct(List.Transform(a, each Text.BeforeDelimiter(_, "-"))), 
    c = List.Transform(b, each List.Select(a, (x) => Text.StartsWith(x, _))), 
    d = Table.FromColumns(c, b)
  ][d]
in
  Result
Power Query solution 4 for Extract and Group, proposed by Meganathan Elumalai:
let
  Source = Excel.CurrentWorkbook(){[Name = "SubDepts"]}[Content], 
  AdCol = Table.AddColumn(
    Source, 
    "Department", 
    each Text.BeforeDelimiter([#"SUB-DEPARTMENT NAMES"], "-")
  ), 
  Group = Table.Group(AdCol, {"Department"}, {{"Count", each _[#"SUB-DEPARTMENT NAMES"]}}), 
  Result = Table.FromColumns(Group[Count], Group[Department])
in
  Result
Power Query solution 5 for Extract and Group, proposed by Meganathan Elumalai:
let
  Source = Excel.CurrentWorkbook(){[Name = "SubDepts"]}[Content], 
  Result = [
    nm = Table.ToList(Source), 
    dpt = List.Distinct(List.Transform(nm, (f) => Text.BeforeDelimiter(f, "-"))), 
    fin = Table.FromColumns(
      List.Transform(dpt, (x) => List.Select(nm, each Text.Contains(_, x))), 
      dpt
    )
  ][fin]
in
  Result
Power Query solution 6 for Extract and Group, proposed by Krzysztof Kominiak:
let
  Source = Excel.CurrentWorkbook(){[Name = "SubDepts"]}[Content], 
  A = Table.AddColumn(Source, "tmp", each Text.Split([#"SUB-DEPARTMENT NAMES"], "-"){0}), 
  B = Table.ExpandTableColumn(
    Table.Group(A, {"tmp"}, {{"count", each Table.AddIndexColumn(_, "Id", 1), type table}}), 
    "count", 
    {"SUB-DEPARTMENT NAMES", "Id"}
  ), 
  C = Table.Pivot(B, List.Distinct(B[tmp]), "tmp", "SUB-DEPARTMENT NAMES"), 
  Result = Table.RemoveColumns(C, {"Id"})
in
  Result
Power Query solution 7 for Extract and Group, proposed by Md. Shah Alam, Microsoft Certified Trainer:
let
  Source = Excel.CurrentWorkbook(){[Name = "SubDepts"]}[Content], 
  a = Table.AddColumn(
    Source, 
    "Sub-department", 
    each Text.BeforeDelimiter([#"SUB-DEPARTMENT NAMES"], "-"), 
    type text
  ), 
  b = Table.Group(a, {"Sub-department"}, {{"AllRows", each Table.AddIndexColumn(_, "SL", 1, 1)}})[
    [AllRows]
  ], 
  c = Table.ExpandTableColumn(
    b, 
    "AllRows", 
    {"SUB-DEPARTMENT NAMES", "Sub-department", "SL"}, 
    {"SUB-DEPARTMENT NAMES", "Sub-department", "SL"}
  ), 
  d = Table.Pivot(c, List.Distinct(c[#"Sub-department"]), "Sub-department", "SUB-DEPARTMENT NAMES"), 
  e = Table.RemoveColumns(d, {"SL"})
in
  e
Power Query solution 8 for Extract and Group, proposed by Zain Shah:
let
  Source = Excel.CurrentWorkbook(){[Name = "SubDepts"]}[Content], 
  Group = Table.Group(
    Source, 
    "SUB-DEPARTMENT NAMES", 
    {
      "All", 
      each 
        let
          a = _, 
          b = {List.Transform(a[#"SUB-DEPARTMENT NAMES"], each Text.BeforeDelimiter(_, "-")){0}}, 
          c = a[#"SUB-DEPARTMENT NAMES"], 
          d = b & c
        in
          d
    }, 
    0, 
    (x, y) => Value.Compare(Text.BeforeDelimiter(y, "-"), Text.BeforeDelimiter(x, "-"))
  )[All], 
  Result = Table.PromoteHeaders(Table.FromColumns(Group))
in
  Result
Power Query solution 9 for Extract and Group, proposed by Ruslan Vdovychenko:
let
  In = Excel.CurrentWorkbook(){[Name = "Table1"]}[Content], 
  #"1" = Table.AddColumn(In, "Extr", each Text.BeforeDelimiter([#"SUB-DEPARTMENT NAMES"], "-")), 
  #"2" = Table.Group(
    #"1", 
    {"Extr"}, 
    {{"Sub", each _, type table [#"SUB-DEPARTMENT NAMES" = text, Extr = text]}}
  ), 
  #"3" = Table.AddColumn(#"2", "AddIn", each Table.AddIndexColumn([Sub], "No", 1)), 
  #"4" = Table.RemoveColumns(#"3", {"Extr", "Sub"}), 
  #"5" = Table.ExpandTableColumn(
    #"4", 
    "AddIn", 
    {"SUB-DEPARTMENT NAMES", "Extr", "No"}, 
    {"SUB-DEPARTMENT NAMES", "Extr", "No"}
  ), 
  #"6" = Table.Pivot(#"5", List.Distinct(#"5"[Extr]), "Extr", "SUB-DEPARTMENT NAMES"), 
  #"7" = Table.RemoveColumns(#"6", {"No"})
in
  #"7"

Solving the challenge of Extract and Group with Excel

Excel solution 1 for Extract and Group, proposed by Rick Rothstein:
=LET(
   t,
   TEXTBEFORE(
       B4:B13,
       "-"),
   d,
   TOROW(
       UNIQUE(
           t)),
   VSTACK(
       TOROW(
           d),
       IFNA(
           DROP(
               REDUCE(
                   "",
                   d,
                   LAMBDA(
                       a,
                       x,
                       HSTACK(
                           a,
                           FILTER(
                               B4:B13,
                               t=x)))),
               ,
               1),
           "")))
Excel solution 2 for Extract and Group, proposed by Kris Jaganah:
=LET(
   a,
   SubDepts[SUB-DEPARTMENT NAMES],
   b,
   TEXTBEFORE(
       a,
       "-"),
   c,
   SEQUENCE(
       ROWS(
           b)),
   d,
   XMATCH(
       b,
       b),
   DROP(
       PIVOTBY(
           c-d,
           HSTACK(
               d,
               b),
           a,
           SINGLE,
           ,
           0,
           ,
           0),
       1,
       1))
Excel solution 3 for Extract and Group, proposed by Hussein SATOUR:
=LET(
   n,
   B4:B13,
   a,
   TEXTBEFORE(
       n,
       "-"),
   b,
   CONCAT(
       MAP(
           UNIQUE(
               a),
           LAMBDA(
               x,
               x&", "&ARRAYTOTEXT(
                   FILTER(
                       n,
                       a=x))&"/"))),
   TRANSPOSE(
       TEXTSPLIT(
           b,
           ", ",
           "/",
           1,
           ,
           "")))
Excel solution 4 for Extract and Group, proposed by Oscar Mendez Roca Farell:
=LET(
   s,
   SubDepts[SUB-DEPARTMENT NAMES],
   t,
   TEXTSPLIT(
       s,
       "-"),
   x,
   XMATCH(
       t,
       t),
   DROP(
       PIVOTBY(
           ROW(
               s)-x,
           HSTACK(
               x,
               t),
           s,
           SINGLE,
           ,
           0,
           ,
           0),
       1,
       1))
Excel solution 5 for Extract and Group, proposed by Duy Tùng:
=LET(
   a,
   B4:B13,
   b,
   TEXTBEFORE(
       a,
       "-"),
   DROP(
       REDUCE(
           0,
           UNIQUE(
               b),
           LAMBDA(
               x,
               v,
               IFNA(
                   HSTACK(
                       x,
                       VSTACK(
                           v,
                           FILTER(
                               a,
                               b=v))),
                   ""))),
       ,
       1))
Excel solution 6 for Extract and Group, proposed by Sunny Baggu:
=LET(
   
    _t,
    SubDepts[SUB-DEPARTMENT NAMES],
   
    _a,
    TEXTBEFORE(
        _t,
         "-"),
   
    _h,
    TOROW(
        UNIQUE(
            _a)),
   
    VSTACK(
        
         _h,
        
         IFNA(
             
              DROP(
                  REDUCE(
                      "",
                       _h,
                       LAMBDA(
                           a,
                            v,
                            HSTACK(
                                a,
                                 FILTER(
                                     _t,
                                      _a = v)))),
                   ,
                   1),
             
              ""
              )
         )
   )
Excel solution 7 for Extract and Group, proposed by Pieter de B.:
=LET(
   b,
   B4:B13,
   L,
   LAMBDA(
       x,
       TEXTBEFORE(
           x,
           "-")),
   c,
   L(
       b),
   DROP(
       PIVOTBY(
           MAP(
               b,
               LAMBDA(
                   m,
                   COUNTIF(
                       B4:m,
                       L(
                           m)&"*"))),
           HSTACK(
               XMATCH(
                   c,
                   c),
               c),
           b,
           SINGLE,
           ,
           0,
           ,
           0),
       1,
       1))
Excel solution 8 for Extract and Group, proposed by Hamidi Hamid:
=LET(
   x,
   TEXTBEFORE(
       B4:B13,
       "-"),
   y,
   TEXTAFTER(
       B4:B13,
       "-",
       1),
   ux,
   UNIQUE(
       x),
   t,
   HSTACK(
       ux,
       MAP(
           ux,
           LAMBDA(
               a,
               ARRAYTOTEXT(
                   FILTER(
                       y,
                       x=a))))),
   k,
   IFERROR(
       DROP(
           REDUCE(
               0,
               TAKE(
                   t,
                   ,
                   -1),
               LAMBDA(
                   a,
                   b,
                   VSTACK(
                       a,
                       TEXTSPLIT(
                           b,
                           ", ",
                           )))),
           1),
       ""),
   TRANSPOSE(
       HSTACK(
           TAKE(
               t,
               ,
               1),
           IF(
               k="",
               "",
               TAKE(
               t,
               ,
               1)&"-"&k))))
Excel solution 9 for Extract and Group, proposed by Asheesh Pahwa:
=LET(
   s,
   B4:B13,
   t,
   TEXTBEFORE(
       s,
       "-"),
   d,
   TOROW(
       UNIQUE(
           t)),
   
   VSTACK(
       d,
       IFNA(
           DROP(
               REDUCE(
                   "",
                   d,
                   LAMBDA(
                       a,
                       v,
                       HSTACK(
                           a,
                           LET(
                               f,
                               ISNUMBER(
                                   FIND(
                                       v,
                                       s)),
                               FILTER(
                                   s,
                                   f))))),
               ,
               1),
           "")))
Excel solution 10 for Extract and Group, proposed by ferhat CK:
=LET(
   a,
   TEXTBEFORE(
       SubDepts[SUB-DEPARTMENT NAMES],
       "-"),
   b,
   UNIQUE(
       a),
   IFNA(
       DROP(
           REDUCE(
               0,
               b,
               LAMBDA(
                   x,
                   y,
                   HSTACK(
                       x,
                       VSTACK(
                           y,
                           FILTER(
                               SubDepts[SUB-DEPARTMENT NAMES],
                               TEXTBEFORE(
                                   SubDepts[SUB-DEPARTMENT NAMES],
                                   "-")=y))))),
           ,
           1),
       ""))
Excel solution 11 for Extract and Group, proposed by Meganathan Elumalai:
=LET(
   x,
   B4:B13,
   nm,
   TEXTBEFORE(
       x,
       "-"),
   dpt,
   UNIQUE(
       nm),
   VSTACK(
       TOROW(
           dpt),
       IFNA(
           DROP(
               REDUCE(
                   "",
                   dpt,
                   LAMBDA(
                       a,
                       v,
                       HSTACK(
                           a,
                           FILTER(
                               x,
                               nm=v)))),
               ,
               1),
           "")))
Excel solution 12 for Extract and Group, proposed by Gerson Pineda:
=LET(
   m,
   SubDepts[SUB-DEPARTMENT NAMES],
   d,
   TEXTBEFORE(
       m,
       "-"),
   IFNA(
       DROP(
           REDUCE(
               1,
               UNIQUE(
                   d),
               LAMBDA(
                   j,
                   x,
                   HSTACK(
                       j,
                       VSTACK(
                           x,
                           FILTER(
                               m,
                               d=x))))),
           ,
           1),
       ""))
Excel solution 13 for Extract and Group, proposed by Anjan Kumar Bose:
=FILTER(
   $A$2:$A$11,
   LEFT(
       $A$2:$A$11,
       LEN(
           E17))=E17,
   "")
Excel solution 14 for Extract and Group, proposed by Tomasz Jakóbczyk:
=TRANSPOSE(
   UNIQUE(
       TEXTBEFORE(
           B4:B13,
           "-")))

D4 and to drag to the right: =FILTER(
   SubDepts[SUB-DEPARTMENT NAMES],
   IFERROR(
       FIND(
           D3,
           SubDepts[SUB-DEPARTMENT NAMES]),
       0)>0)
Excel solution 15 for Extract and Group, proposed by CA Mohit Saxena:
=y)))),
   ,
   1))),
   "")
Excel solution 16 for Extract and Group, proposed by Mohit Rawat:
=VSTACK(
   TRANSPOSE(
       UNIQUE(
           TEXTBEFORE(
               SubDepts[SUB-DEPARTMENT NAMES],
               "-"))),
   MAKEARRAY(
       3,
       4,
       LAMBDA(
           r,
           c,
           
            IF(
                c=1,
                 IFERROR(
                     INDEX(
                         FILTER(
                             SubDepts[SUB-DEPARTMENT NAMES],
                             LEFT(
                                 SubDepts[SUB-DEPARTMENT NAMES],
                                 17)="Sales & Marketing"),
                         r),
                     ""),
                
                 IF(
                     c=2,
                      IFERROR(
                          INDEX(
                              FILTER(
                                  SubDepts[SUB-DEPARTMENT NAMES],
                                  LEFT(
                                      SubDepts[SUB-DEPARTMENT NAMES],
                                      11)="Procurement"),
                              r),
                          ""),
                     
                      IF(
                          c=3,
                           IFERROR(
                               INDEX(
                                   FILTER(
                                       SubDepts[SUB-DEPARTMENT NAMES],
                                       LEFT(
                                           SubDepts[SUB-DEPARTMENT NAMES],
                                           5)="Admin"),
                                   r),
                               ""),
                          
                           IF(
                               c=4,
                                IFERROR(
                                    INDEX(
                                        FILTER(
                                            SubDepts[SUB-DEPARTMENT NAMES],
                                            LEFT(
                                                SubDepts[SUB-DEPARTMENT NAMES],
                                                7)="Finance"),
                                        r),
                                    "")
                                )))))))

Solving the challenge of Extract and Group with Python

Python solution 1 for Extract and Group, proposed by Konrad Gryczan, PhD:
import pandas as pd
path = "files/CHALLENGE 1205.xlsx"
input = pd.read_excel(path, usecols="B", skiprows=2, nrows=11)
test = pd.read_excel(path, usecols="D:G", skiprows=2, nrows=4)
input[['sub_department', 'department']] = input['SUB-DEPARTMENT NAMES'].str.split('-', n=1, expand=True)
input['rn'] = input.groupby('sub_department').cumcount() + 1
input = input.drop(columns=['department'])
result = input.pivot(index='rn', columns='sub_department', values='SUB-DEPARTMENT NAMES').reindex(columns=input['sub_department'].unique()).reset_index(drop=True)
result.index.name = None
result.columns.name = None
print(result.equals(test)) # True
Python solution 2 for Extract and Group, proposed by Luan Rodrigues:
import pandas as pd
file = r"CHALLENGE 1205.xlsx"
df = pd.read_excel(file,usecols="B",skiprows=2)
names = df['SUB-DEPARTMENT NAMES'].str.split('-').str[:1].explode().unique()
lista_dfs = [df[df['SUB-DEPARTMENT NAMES'].str.startswith(name)] for name in names]
df_final = pd.concat(lista_dfs, axis=1)
df_final.columns = names
def limpar_nans_colunas(df):
 df_limpo = pd.DataFrame({col: df[col].dropna().reset_index(drop=True) for col in df.columns })
 return df_limpo
df_final = limpar_nans_colunas(df_final)
print(df_final)

Solving the challenge of Extract and Group with Python in Excel

Python in Excel solution 1 for Extract and Group, proposed by Alejandro Campos:
data = xl("SubDepts[SUB-DEPARTMENT NAMES]")[0]
df = pd.DataFrame([item.split('-', 1) for item in data], columns=['Department', 'Sub-department'])
df['Sub-department'] = df['Department'] + '-' + df['Sub-department']
cross_tab = df.pivot_table(index=df.groupby('Department').cumcount(), columns='Department', values='Sub-department', aggfunc=' '.join, fill_value='')
cross_tab = cross_tab[["SALES & MARKETING", "PROCUREMENT", "ADMIN", "FINANCE"]]
cross_tab.columns.name = None
cross_tab

Solving the challenge of Extract and Group with R

R solution 1 for Extract and Group, proposed by Konrad Gryczan, PhD:
library(tidyverse)
library(readxl)
path = "files/CHALLENGE 1205.xlsx"
input = read_excel(path, range = "B3:B13")
test  = read_excel(path, range = "D3:G7")
result = input %>%
 separate(`SUB-DEPARTMENT NAMES`, into = c("sub_department", "department"), sep = "-", extra = "merge", remove = F) %>%
 mutate(rn = row_number(), .by = sub_department) %>%
 select(-department) %>%
 pivot_wider(names_from = sub_department, values_from = `SUB-DEPARTMENT NAMES`) %>%
 select(-rn)
all.equal(result, test, check.attributes = FALSE)
#> [1] TRUE

Leave a Reply