c# – Generic Method to Flatten a Collection of Nested Objects into a DataTable?

If you’re only working with one type of business object (in this case, Customer), then I recommend @Huntbook’s answer, as that simplifies this problem.

That said, if you truly need this to be a generic method because, eg, you’ll be handling a variety of different business objects (ie, not exclusively Customer), then you can certainly expand your proposed CreateDataTableFromAnyCollection<T>() method to support recursion, though it isn’t a trivial task.

I will walk you through a basic implementation of this below. This will have some limitations, which I’ll discuss at the end. These may or may not matter for your application. Regardless, this will establish a foundation for dynamically translating object graphs into a flattened DataTable, which you can build off of.

Basic Approach

The recursion process isn’t quite as straightforward as you might expect, since you’re looping through a collection of objects, yet only need to determine the definition of the DataTable once.

As a result, it makes more sense to separate out your recursive functionality into two separate methods: One for establishing the schema, the other for populating the DataTable. I propose:

  1. EstablishDataTableFromType(), which dynamically establishes the schema of the DataTable based on a given Type (along with any nested types), and
  2. GetValuesFromObject(), which, for each (nested) object in your source list, adds the values from each property to a list of values, which can then be added to a DataTable.

Challenges

The basic approach glosses over some challenges, however, which are worth acknowledging. These include:

  1. How do we determine if a property is a collection—and, thus, subject to recursion? We will be able to use the Type.GetGenericTypeDefinition() for this.

  2. If it is a collection, how do we determine what type the collection contains (eg, Order, Item)? We will be able to use the Type.GetGenericArguments() for this.

  3. How do we ensure all data is represented, given that each nested item necessitates an additional row? We will need to establish a new record for every permutation in the object graph.

  4. What happens if you have two collections on an object, as per @DBC’s question in the comments? My code assumes you’ll want a permutation for each. So if you added Addresses to Customer, this might yield something like:

    Customer.Name  Customer.Orders.ID  Customer.Orders.Items.SKU  Customer.Addresses.PostalCode
    Bill Gates     0                   001                        98052
    Bill Gates     0                   002                        98052
    Bill Gates     0                   001                        98039
    Bill Gates     0                   002                        98039

    That may not be a valid assumption, but you can tweak the logic per your requirements.

  5. What happens if an object has two collections of the same Type? Your proposal implies that the DataColumn names should be delineated by the Type, but that would introduce a naming conflict. To address that, I assume the property name should be used as the delineator, not the property Type.
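
Challenges 1 and 2 can be sketched in isolation as follows. This is a minimal illustration rather than part of the solution below; the CollectionTypeProbe class and its TryGetElementType() method are hypothetical names introduced only for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class CollectionTypeProbe
{
    // Hypothetical helper: returns true, along with the element type, if the
    // given type implements ICollection<T> (eg, List<Order> yields Order).
    public static bool TryGetElementType(Type type, out Type? elementType)
    {
        var collectionInterface = type
            .GetInterfaces()
            .FirstOrDefault(i =>
                i.IsGenericType &&
                i.GetGenericTypeDefinition() == typeof(ICollection<>));

        // ICollection<T> has exactly one generic argument: the element type
        elementType = collectionInterface?.GetGenericArguments()[0];
        return elementType is not null;
    }
}
```

Under these assumptions, TryGetElementType(typeof(List<int>), out var t) returns true with t set to typeof(int), while typeof(string) returns false, since string implements IEnumerable<char> but not ICollection<char>.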

Solution

The code is fairly complex. I’ll provide a brief summary of each method below. In addition, I’ve added some comments within the code. If you have questions about any specific functionality, however, please ask and I’ll provide further clarification.

EstablishDataTableFromType(): This method will establish a DataTable definition based on a given Type. Instead of simply looping through values, however, this method will recurse over any ICollection<T> types discovered.

/// <summary>
///   Populates a <paramref name="dataTable"/> with <see cref="DataColumn"/> 
///   definitions based on a given <paramref name="type"/>. Optionally prefixes 
///   the <see cref="DataColumn"/> name with a <paramref name="prefix"/> to 
///   handle nested types.
/// </summary>
/// <param name="type">
///   The <see cref="Type"/> to derive the <see cref="DataColumn"/> definitions
///   from, based on properties.
/// </param>
/// <param name="dataTable">
///   The <see cref="DataTable"/> to add the <see cref="DataColumn"/> definitions to.
/// </param>
/// <param name="prefix">
///   The prefix to prepend to the <see cref="DataColumn"/> definition.
/// </param>

private static void EstablishDataTableFromType(Type type, DataTable dataTable, string prefix = "") {
    var properties = type.GetProperties();
    foreach (System.Reflection.PropertyInfo property in properties) 
    {
        if (!IsList(property.PropertyType)) 
        {
            dataTable.Columns.Add(
                new DataColumn(
                    prefix + property.Name, 
                    Nullable.GetUnderlyingType(property.PropertyType) ?? property.PropertyType
                )
            );
        }
        else 
        {
            // If the property is a generic list, detect the generic type used 
            // for that list
            var listType = property.PropertyType
                .GetInterfaces()
                .First(i => i.IsGenericType && i.GetGenericTypeDefinition() == typeof(ICollection<>))
                .GetGenericArguments()[0];
            // Recursively call this method in order to define columns for 
            // nested types
            EstablishDataTableFromType(listType, dataTable, prefix + property.Name + ".");
        }
    }
}

GetValuesFromObject(): This method will take a source Object and, for every property, add the value of the property to an object[]. If an Object contains an ICollection<> property, it will recurse over that property, establishing an object[] for every permutation.

/// <summary>
///   Populates a <paramref name="target"/> list with an array of
///   <see cref="object"/> instances representing the values of each property on a
///   <paramref name="source"/>.
/// </summary>
/// <remarks>
///   If the <paramref name="source"/> contains a nested <see cref="ICollection{T}"/>,
///   then this method will be called recursively, resulting in a new record for
///   every nested item in that <see cref="ICollection{T}"/>.
/// </remarks>
/// <param name="source">
///   The source <see cref="Object"/> from which to pull the property values.
/// </param>
/// <param name="target">
///   A <see cref="List{T}"/> to store the <paramref name="source"/> values in.
/// </param>
/// <param name="columnIndex">
///   The index associated with the property of the <paramref name="source"/> 
///   object.
/// </param>

private static void GetValuesFromObject(Object source, List<object[]> target, ref int columnIndex) 
{

    var type                = source.GetType();
    var properties          = type.GetProperties();

    for (int i = 0; i < properties.Length; i++) 
    {

        var property        = properties[i];
        var value           = property.GetValue(source, null);
        var baseIndex       = columnIndex;

        // If the property is not a list, write its value to every instance of 
        // the target object. If there are multiple objects, the value should be 
        // written to every permutation
        if (!IsList(property.PropertyType)) 
        {
            foreach (var row in target) 
            {
                row[columnIndex] = value;
            }
            columnIndex++;
        }

        // If the property is a generic list, recurse over each instance of that 
        // object. As part of this, establish copies of the objects in the target
        // storage to ensure that a new permutation is created for every nested
        // object.
        else 
        {
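            // Cast to the non-generic IEnumerable (System.Collections), which every
            // ICollection<T> implements; not every ICollection<T> (eg, HashSet<T>)
            // implements the non-generic ICollection
            var list        = (IEnumerable)value!;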
            var collated    = new List<Object[]>();
            foreach (var item in list) 
            {
                columnIndex = baseIndex;
                var values  = new List<Object[]>();
                foreach (var baseItem in target) 
                {
                    values.Add((object[])baseItem.Clone());
                }
                GetValuesFromObject(item, values, ref columnIndex);
                collated.AddRange(values);
            }
            target.Clear();
            target.AddRange(collated);
        }
    }
}
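
The row-cloning step is the heart of this method, so here is a stripped-down sketch of just that idea, without any reflection. The RowExpander class and its Expand() method are hypothetical names used for illustration only; they are not part of the solution.

```csharp
using System.Collections.Generic;

public static class RowExpander
{
    // Hypothetical helper: for each nested value, clone every existing row and
    // write the value into the given column. The result therefore contains
    // rows.Count * values.Count rows.
    public static List<object?[]> Expand(
        List<object?[]> rows, int column, IReadOnlyList<object?> values)
    {
        var expanded = new List<object?[]>();
        foreach (var value in values)
        {
            foreach (var row in rows)
            {
                // Clone so that each permutation gets its own copy of the row
                var copy = (object?[])row.Clone();
                copy[column] = value;
                expanded.Add(copy);
            }
        }
        return expanded;
    }
}
```

Starting from a single row and expanding it first by two SKUs and then by two postal codes yields four rows, which is the same fan-out GetValuesFromObject() produces when an object has two collections.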

CreateDataTableFromAnyCollection: The original method you provided obviously needs to be updated to call the EstablishDataTableFromType() and GetValuesFromObject() methods, thus supporting recursion, instead of simply looping over a flat list of properties. This is easy to do, though it does require a bit of scaffolding given how I’ve written the GetValuesFromObject() signature.

/// <summary>
///   Given a <paramref name="list"/> of <typeparamref name="T"/> objects, will 
///   return a <see cref="DataTable"/> with a <see cref="DataRow"/> representing 
///   each instance of <typeparamref name="T"/>. 
/// </summary>
/// <remarks>
///   If <typeparamref name="T"/> contains any nested <see cref="ICollection{T}"/>, the
///   schema will be flattened. As such, each instance of <typeparamref name="T"/>
///   will have one record for every nested item in each
///   <see cref="ICollection{T}"/>.
/// </remarks>
/// <typeparam name="T">
///   The <see cref="Type"/> that the source <paramref name="list"/> contains a
///   list of.
/// </typeparam>
/// <param name="list">
///   A list of <typeparamref name="T"/> instances to be added to the <see cref=
///   "DataTable"/>.
/// </param>
/// <returns>
///   A <see cref="DataTable"/> containing (at least) one <see cref="DataRow"/> 
///   for each item in <paramref name="list"/>.
/// </returns>

public static DataTable CreateDataTableFromAnyCollection<T>(IEnumerable<T> list) 
{

    var dataTable           = new DataTable();

    EstablishDataTableFromType(typeof(T), dataTable, typeof(T).Name + ".");

    foreach (T source in list) 
    {
        var values          = new List<Object[]>();
        var currentIndex    = 0;

        // Establish an initial array to store the values of the source object
        values.Add(new object[dataTable.Columns.Count]);

        if (source is not null) 
        {
            GetValuesFromObject(source, values, ref currentIndex);
        }

        // If the source object contains nested lists, then multiple permutations 
        // of source object will be returned
        foreach (var value in values) 
        {
            dataTable.Rows.Add(value);
        }

    }

    return dataTable;

}

IsList(): Finally, I’ve added a simple helper method for determining if the Type of a given property is a generic ICollection<> or not. It is used by both EstablishDataTableFromType() and GetValuesFromObject(). This relies on the Type type’s IsGenericType and GetGenericTypeDefinition().

/// <summary>
///   Simple helper function to determine if a given <paramref name="type"/> is a 
///   generic <see cref="ICollection{T}"/>.
/// </summary>
/// <param name="type">
///   The <see cref="Type"/> to determine if it is an <see cref="ICollection{T}"/>.
/// </param>
/// <returns>
///   Returns <c>true</c> if the <paramref name="type"/> implements the generic
///   <see cref="ICollection{T}"/> interface.
/// </returns>

private static bool IsList(Type type) => type
    .GetInterfaces()
    .Any(i => i.IsGenericType && i.GetGenericTypeDefinition() == typeof(ICollection<>));

Validation

Here’s a simple unit test (written for xUnit) to validate the functionality. This is a very simple test that just confirms that the number of DataRow instances in the DataTable matches the anticipated number of permutations; it doesn’t validate the actual data in each record, though I’ve separately validated that the data is correct:

[Fact]
public void CreateDataTableFromAnyCollection() 
{
    
    // ARRANGE

    var customers           = new List<Customer>();

    // Create an object graph of Customer, Order, and Item instances, three per
    // collection 
    for (var i = 0; i < 3; i++) 
    {
        var customer        = new Customer() {
            Email           = "Customer" + i + "@domain.tld",
            Name            = "Customer " + i
        };
        for (var j = 0; j < 3; j++) 
        {
            var order = new Order() 
            {
                ID = i + "." + j
            };
            for (var k = 0; k < 3; k++) 
            {
                order.Items.Add(
                    new Item() 
                    {
                        Description = "Item " + k,
                        SKU = "0123-" + k,
                        Price = i + (k * .1)
                    }
                );
            }
            customer.Orders.Add(order);
        }
        customers.Add(customer);
    }

    // ACT
    var dataTable = ParentClass.CreateDataTableFromAnyCollection<Customer>(customers);

    // ASSERT
    Assert.Equal(27, dataTable.Rows.Count);

    // CLEANUP VALUES
    dataTable.Dispose();

}
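
The test relies on the Customer, Order, and Item classes from the question, which aren’t reproduced in this answer. A minimal model consistent with the test might look like the following. The property names and types are inferred from the test itself (eg, ID is assigned a string and Price a double), so treat them as assumptions rather than the original definitions.

```csharp
using System.Collections.Generic;

// Assumed model, reconstructed from how the unit test uses it
public class Customer
{
    public string Name { get; set; } = "";
    public string Email { get; set; } = "";

    // Initialized up front so the test can call Orders.Add() directly
    public List<Order> Orders { get; } = new List<Order>();
}

public class Order
{
    public string ID { get; set; } = "";
    public List<Item> Items { get; } = new List<Item>();
}

public class Item
{
    public string SKU { get; set; } = "";
    public string Description { get; set; } = "";
    public double Price { get; set; }
}
```

With a model of this shape, three customers with three orders of three items each produce 3 × 3 × 3 = 27 rows, and the columns are named by property path, eg Customer.Name, Customer.Orders.ID, and Customer.Orders.Items.SKU.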

Limitations

This is a complex problem, and my basic implementation has a number of limitations:

  1. Complex Properties: The only non-primitive property type supported is a generic ICollection<> (including, eg, List<>, Collection<>, Dictionary<>, KeyCollection<>, etc.). As a result, any other type of object will likely result in an exception or, at least, unexpected data. A production version would likely want to:

    • Recurse over complex properties that aren’t stored in an ICollection<>,
    • Detect unsupported property types and either skip them or throw a validation exception,
    • Support, eg, a custom attribute to skip mapping of particular properties (such as those you have no intention of supporting).

  2. List Population: This assumes that each of your lists is initialized and populated with at least one object. The GetValuesFromObject() method currently relies on assessing properties from each object instance to increment the columnIndex parameter; to support null or empty lists, it would need a more sophisticated way of tracking which property corresponds to which position in the array.

  3. Inheritance: As with your original code, this relies on Type.GetProperties(), which returns every public property of your model, including those inherited from base classes (eg, if Customer inherits from Person). That is usually what you want, but if you inherit from types in the base class library, you may find that includes some properties you don’t want mapped. In that case, you’ll need to filter the properties, eg by passing BindingFlags.DeclaredOnly or by checking each property’s DeclaringType.
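
As one illustration of the first limitation, the following sketch filters out properties before they’re mapped: it drops indexers, write-only properties, and anything marked with a custom attribute. SkipColumnAttribute, PropertyFilter.GetMappableProperties(), and SampleModel are all hypothetical names introduced for this example. To take effect, a helper like this would need to replace type.GetProperties() in both EstablishDataTableFromType() and GetValuesFromObject(), so that the schema and the values stay aligned.

```csharp
using System;
using System.Linq;
using System.Reflection;

// Hypothetical marker attribute for properties that should not be mapped
[AttributeUsage(AttributeTargets.Property)]
public sealed class SkipColumnAttribute : Attribute { }

public static class PropertyFilter
{
    // Returns the public instance properties that are readable, are not
    // indexers, and are not marked with [SkipColumn].
    public static PropertyInfo[] GetMappableProperties(Type type) => type
        .GetProperties(BindingFlags.Public | BindingFlags.Instance)
        .Where(p => p.CanRead)
        .Where(p => p.GetIndexParameters().Length == 0)
        .Where(p => !p.IsDefined(typeof(SkipColumnAttribute), inherit: true))
        .ToArray();
}

// Sample type used only to demonstrate the filter
public class SampleModel
{
    public string Name { get; set; } = "";

    [SkipColumn]
    public string InternalNotes { get; set; } = "";
}
```

For SampleModel, GetMappableProperties() returns only the Name property, since InternalNotes is marked with the hypothetical [SkipColumn] attribute.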

None of these limitations will affect your sample model, nor will they be a concern if you write your model with them in mind. If you’re working with a far more complex, established model, however, you’ll likely want to improve my code to address these limitations. Addressing those would make this already complex code much more involved, however, so I’m leaving them for you to address in your implementation.

Conclusion

This should give you a good idea of how to dynamically map an object graph into a flattened DataTable, while providing a solid foundation to build off of based on the specific requirements of your model.
