Resolved Compare two xml files and create new

PD1991

Hello,

I am creating two XML files by serializing an XML class object, and I want to compare the two files and create a new one.
Is there a simple way to do this?
For example:
* Case 1
File1.xml
C#:
<Model>
<ChildGroups>
<Group>
<Name>Grp1</Name>
</Group>
<Group>
<Name>Grp2</Name>
</Group>
</ChildGroups>
<ChildGroups>
<Group>
<Name>Grp3</Name>
</Group>
<Group>
<Name>Grp4</Name>
</Group>
</ChildGroups>
</Model>

File2.xml
C#:
<Model>
<ChildGroups>
<Group>
<Name>Grp1</Name>
</Group>
</ChildGroups>
<ChildGroups>
<Group>
<Name>Grp3</Name>
</Group>
<Group>
<Name>Grp4</Name>
</Group>
</ChildGroups>
</Model>

Output.xml
C#:
<Model>
<ChildGroups>
<Group>
<Name>Grp1</Name>
</Group>
<Group>
<Name>Grp2_ToBeDeleted</Name> //Rename the node as it doesn't exist in File1.xml
</Group>
</ChildGroups>
<ChildGroups>
<Group>
<Name>Grp3</Name>
</Group>
<Group>
<Name>Grp4</Name>
</Group>
</ChildGroups>
</Model>

* Case 2
File1.xml
C#:
<Model>
<ChildGroups>
<Group>
<Name>Grp1</Name>
</Group>
</ChildGroups>
<ChildGroups>
<Group>
<Name>Grp4</Name>
</Group>
<Group>
<Name>Grp4</Name>
</Group>
</ChildGroups>
</Model>

File2.xml
C#:
<Model>
<ChildGroups>
<Group>
<Name>Grp1</Name>
</Group>
<Group>
<Name>Grp2</Name>
</Group>
</ChildGroups>
<ChildGroups>
<Group>
<Name>Grp3</Name>
</Group>
<Group>
<Name>Grp4</Name>
</Group>
</ChildGroups>
</Model>

Output.xml
C#:
<Model>
<ChildGroups>
<Group>
<Name>Grp1</Name>
</Group>
<Group>
<Name>Grp2</Name> //Add Grp2 as it exists in File2.xml
</Group>
</ChildGroups>
<ChildGroups>
<Group>
<Name>Grp3</Name>
</Group>
<Group>
<Name>Grp4</Name>
</Group>
</ChildGroups>
</Model>
 
Solution
I think I have understood the question and requirements:
  • compare groups by hierarchy, identifying each group by its child Name value
  • start with file2 (modifying as we go) and remove groups that don't exist in file1
  • add extra groups from file1 to file2
  • save as a new output file

Based on the sample files I wrote this example code:
C#:
//using System.Xml.Linq;
//using System.Xml.XPath;

void MergeXml()
{
    var doc1 = XDocument.Load(@"C:\Users\xylo\Downloads\Files\File1.xml");
    var doc2 = XDocument.Load(@"C:\Users\xylo\Downloads\Files\File2.xml");

    //start with file2 (modify as we go), remove groups that doesn't exist in file1
    foreach (var group in doc2.Descendants("Group").ToArray())
    {
        var xpath =...
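
The listing above is cut off in this copy of the thread. As a rough guide to the four steps in the solution, here is a minimal sketch pieced together from the code quoted later in the thread. It is not the original listing: the inline sample XML, the assumed Model/Groups/Group/ChildGroups layout and the body of the GetXpathByChildName helper are assumptions made for illustration.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;
using System.Xml.XPath;

// Assumed layout: Model/Groups/Group, with nested groups under Group/ChildGroups.
var doc1 = XDocument.Parse(@"<Model><Groups>
  <Group><Name>Grp1</Name><ChildGroups>
    <Group><Name>Grp1a</Name><ChildGroups /></Group>
  </ChildGroups></Group>
</Groups></Model>");
var doc2 = XDocument.Parse(@"<Model><Groups>
  <Group><Name>Grp1</Name><ChildGroups /></Group>
  <Group><Name>Grp2</Name><ChildGroups /></Group>
</Groups></Model>");

// Hypothetical helper body: element names from the root down to this group,
// with a Name predicate on the last step only.
string GetXpathByChildName(XElement element)
{
    var name = element.Element("Name").Value;
    var path = string.Join("/", element.AncestorsAndSelf().Reverse().Select(a => a.Name.LocalName));
    return path + $"[Name='{name}']";
}

// Start with file2: remove groups that have no counterpart in file1.
// (Later in the thread this step is swapped for a rename with a _ToBeDeleted suffix.)
foreach (var group in doc2.Descendants("Group").ToArray())
{
    if (doc1.XPathSelectElement(GetXpathByChildName(group)) == null)
    {
        group.Remove();
    }
}

// Add groups that exist only in file1 to file2.
foreach (var group in doc1.Descendants("Group").ToArray())
{
    if (doc2.XPathSelectElement(GetXpathByChildName(group)) != null)
    {
        continue;
    }
    var parentGroup = group.Ancestors("Group").FirstOrDefault();
    if (parentGroup == null)
    {
        // Top-level group: goes directly under Model/Groups.
        doc2.XPathSelectElement("//Model/Groups").Add(group);
    }
    else
    {
        // Nested group: goes into the ChildGroups of the matching parent in file2.
        var otherParent = doc2.XPathSelectElement(GetXpathByChildName(parentGroup));
        otherParent.Element("ChildGroups").Add(group);
    }
}

// Collect the resulting group names; a real run would call doc2.Save(...) here.
var names = doc2.Descendants("Group").Select(g => g.Element("Name").Value).ToArray();
Console.WriteLine(string.Join(", ", names));
```

Add clones an element that already has a parent, so file1 is left untouched. The ToArray() calls take a snapshot, so the documents can be modified while looping.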
Perhaps I'm missing something, but how does the code in post #13 decide which <ChildGroups> in doc2 a group found in doc1 should be added to?

For example:
File1:
<Model>
    <ChildGroups>
        <Group>
            <Name>grp1</Name>
        </Group>
    </ChildGroups>
    <ChildGroups>
        <Group>
            <Name>grp2</Name>
        </Group>
    </ChildGroups>
</Model>

File2.xml:
<Model>
    <ChildGroups>
        <Group>
            <Name>grp1</Name>
        </Group>
    </ChildGroups>
</Model>

I would expect a new <ChildGroups> node to be created, but my reading of the code in post #13 is that it will stick "grp2" into the first and only <ChildGroups> of File2.xml.
 
Yes, because ChildGroups only occurs as a child of Group, and each Group has exactly one ChildGroups element as the container for its child groups. A Group is identified by its Name and its place in the hierarchy via xpath. If a Group is not inside a ChildGroups element, it is added to the "top level", //Model/Groups.
I have attached a filtered "file1" with only the layout and names, where this is easy to see.
 

Attachments

  • File1-filter.zip
So you just update the Name element. In place of line 15:
C#:
group.Element("Name").Value += "_ToBeDeleted";
Insert at line 25:
C#:
group.Element("Name").Value += "_New";
Can we apply the modification to only a specific 'Model'? In my case, File1 can contain multiple Models, so I want to modify only the Model whose name exists in File2. File2 will always have a single Model.
 
Yes. Get Model/Name from file2 (thename); then, in the part that gets groups from file1 (to add), only get groups from that model.
In essence, change doc1.Descendants("Group") to doc1.XPathSelectElement("//Model[Name='thename']").Descendants("Group").
Similarly, for the part that looks up file1 (to remove), modify the xpath to select the model by name as well.
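
As an illustration of the model-scoped selection described above, here is a small sketch. The Models root element and the sample names are assumptions, since the real file layout is only available in the attachments.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;
using System.Xml.XPath;

// Assumed layout: a Models root holding one or more Model elements.
var doc1 = XDocument.Parse(@"<Models>
  <Model><Name>A</Name><Groups><Group><Name>Grp1</Name></Group></Groups></Model>
  <Model><Name>B</Name><Groups><Group><Name>Grp9</Name></Group></Groups></Model>
</Models>");
var doc2 = XDocument.Parse(@"<Models>
  <Model><Name>B</Name><Groups /></Model>
</Models>");

// Model name taken from file2, which always has a single Model.
var modelName = doc2.XPathSelectElement("//Model/Name")?.Value;

// XPathSelectElement returns null when nothing matches, so guard before Descendants.
var model = doc1.XPathSelectElement($"//Model[Name='{modelName}']");
var groupNames = model == null
    ? Array.Empty<string>()
    : model.Descendants("Group").Select(g => g.Element("Name").Value).ToArray();

Console.WriteLine(string.Join(", ", groupNames));
```

Only the groups of the model named in file2 are returned. A model name containing a single quote would break the XPath predicate built this way, so that case would need separate handling.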
 
_ToBeDeleted is still added to groups belonging to other models as well.
Below is the code for reference.

C#:
var doc1 = XDocument.Load(Path.Combine(Path.GetDirectoryName(Assembly.GetExecutingAssembly().Location), "EquipmentModel.xml"));
var doc2 = XDocument.Load(exportFileName);

//start with file2 (modify as we go), remove groups that doesn't exist in file1
foreach (var group in doc2.Descendants("Group").ToArray())
{
    var xpath = GetXpathByChildName(group);
    var other = doc1.XPathSelectElement(xpath.Replace("_ToBeDeleted", string.Empty));
    if (other == null && !group.Element("Name").Value.EndsWith("_ToBeDeleted"))
    {
        group.Element("Name").Value += "_ToBeDeleted";
    }
    else if (other != null && xpath.Contains("_ToBeDeleted"))
    {
        group.Element("Name").Value = group.Element("Name").Value.Replace("_ToBeDeleted", string.Empty);
    }
}

//add extra groups from file1 to file2
foreach (var group in doc1.XPathSelectElement("//Model[Name='" + modelName + "']").Descendants("Group").ToArray())
{
    var xpath = GetXpathByChildName(group);
    var other = doc2.XPathSelectElement(xpath);
    if (other == null)
    {
        var parentgroup = group.Ancestors("Group").FirstOrDefault();
        if (parentgroup == null)
        {
            doc2.XPathSelectElement("//Model/Groups").Add(group);
        }
        else
        {
            var parentXpath = GetXpathByChildName(parentgroup);
            var otherparent = doc2.XPathSelectElement(parentXpath);
            otherparent.Element("ChildGroups").Add(group);
        }
    }
}
//save as output
doc2.Save(Path.Combine(Path.GetDirectoryName(Assembly.GetExecutingAssembly().Location), "Final.xml"));
 
Not if you do this:
Similarly, for the part that looks up file1 (to remove), modify the xpath to select the model by name as well.
I would probably write a GetXpathByModelNameChildName method for this, for example:
C#:
string GetXpathByModelNameChildName(XElement element, string modelname)
{
    var name = element.Element("Name").Value;
    var xpath = string.Join("/", element.AncestorsAndSelf().Reverse().Select(a => a.Name.LocalName).ToArray()) + $"[Name='{name}']";
    return xpath.Replace("/Model/", $"/Model[Name='{modelname}']/");
}
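
To make the effect of this helper concrete, the following sketch prints the xpath it builds and shows that a lookup scoped to one model does not match a group that exists only in another. The Models root element and the sample data are assumptions for illustration.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;
using System.Xml.XPath;

// Helper as posted in the thread.
string GetXpathByModelNameChildName(XElement element, string modelname)
{
    var name = element.Element("Name").Value;
    var xpath = string.Join("/", element.AncestorsAndSelf().Reverse().Select(a => a.Name.LocalName).ToArray()) + $"[Name='{name}']";
    return xpath.Replace("/Model/", $"/Model[Name='{modelname}']/");
}

// Assumed layout: a Models root with two models; Grp2 exists only in model B.
var doc1 = XDocument.Parse(@"<Models>
  <Model><Name>A</Name><Groups>
    <Group><Name>Grp1</Name></Group>
  </Groups></Model>
  <Model><Name>B</Name><Groups>
    <Group><Name>Grp1</Name></Group>
    <Group><Name>Grp2</Name></Group>
  </Groups></Model>
</Models>");

var grp1InA = doc1.XPathSelectElement("//Model[Name='A']//Group[Name='Grp1']");
var xpath = GetXpathByModelNameChildName(grp1InA, "A");
Console.WriteLine(xpath);

// The same group looked up under two different model names.
var grp2InB = doc1.XPathSelectElement("//Model[Name='B']//Group[Name='Grp2']");
var scopedToA = doc1.XPathSelectElement(GetXpathByModelNameChildName(grp2InB, "A"));
var scopedToB = doc1.XPathSelectElement(GetXpathByModelNameChildName(grp2InB, "B"));
Console.WriteLine(scopedToA == null ? "not found in A" : "found in A");
Console.WriteLine(scopedToB == null ? "not found in B" : "found in B");
```

One thing to check against the real files: the Replace only matches when Model has a parent element. If Model were the document root, the generated path would start with Model/ and would be left unscoped.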
 
No, still the same. Groups of another model are still affected.

C#:
var doc1 = XDocument.Load(Path.Combine(Path.GetDirectoryName(Assembly.GetExecutingAssembly().Location), "EquipmentModel.xml"));
var doc2 = XDocument.Load(exportFileName);

//start with file2 (modify as we go), remove groups that doesn't exist in file1
foreach (var group in doc2.Descendants("Group").ToArray())
{
    var xpath = GetXpathByModelNameChildName(group, modelName);
    var other = doc1.XPathSelectElement(xpath.Replace("_ToBeDeleted", string.Empty));
    if (other == null && !group.Element("Name").Value.EndsWith("_ToBeDeleted"))
    {
        group.Element("Name").Value += "_ToBeDeleted";
    }
    else if (other != null && xpath.Contains("_ToBeDeleted"))
    {
        group.Element("Name").Value = group.Element("Name").Value.Replace("_ToBeDeleted", string.Empty);
    }
}

//add extra groups from file1 to file2
foreach (var group in doc1.XPathSelectElement("//Model[Name='" + modelName + "']").Descendants("Group").ToArray())
{
    var xpath = GetXpathByModelNameChildName(group, modelName);
    var other = doc2.XPathSelectElement(xpath);
    if (other == null)
    {
        var parentgroup = group.Ancestors("Group").FirstOrDefault();
        if (parentgroup == null)
        {
            doc2.XPathSelectElement("//Model/Groups").Add(group);
        }
        else
        {
            var parentXpath = GetXpathByModelNameChildName(parentgroup, modelName);
            var otherparent = doc2.XPathSelectElement(parentXpath);
            otherparent.Element("ChildGroups").Add(group);
        }
    }
}
//save as output
doc2.Save(Path.Combine(Path.GetDirectoryName(Assembly.GetExecutingAssembly().Location), "Final.xml"));
 

Attachments

  • MultipleModelFile.zip
I don't see how that is possible since file2 has one model and you look up that model in file1.
 
In my case doc1 plays the role of file2, and doc1 is the file with multiple models that I shared above.
If I pass file1 as doc1, the _ToBeDeleted logic does not work. So I reversed them, and everything works except adding the suffix only for the specific model.
 
file2 is the starting point and has one model; only this is modified (mark for deletion and add extra groups) and finally saved as the new output file.
file1, with multiple models, is what you compare against (and only the groups of the model named in file2).
 
I modified the code a bit and it's working well now.
C#:
//start with file2 (modify as we go), remove groups that doesn't exist in file1
foreach (var group in doc1.XPathSelectElement("//Model[Name='" + glbFileNameWithoutExtension + "']").Descendants("Group").ToArray())
{
    var xpath = GetXpathByChildName(group);
    var other = doc2.XPathSelectElement(xpath.Replace("_ToBeDeleted", string.Empty));
    if (other == null && !group.Element("Name").Value.EndsWith("_ToBeDeleted"))
    {
        group.Element("Name").Value += "_ToBeDeleted";
    }
    else if (other != null && xpath.Contains("_ToBeDeleted"))
    {
        group.Element("Name").Value = group.Element("Name").Value.Replace("_ToBeDeleted", string.Empty);
    }
}
Lastly, it takes a few seconds to write the new file. Can we check whether the file has finished writing and, once it has, import the newly created file?
 
Saving is complete when the Save method returns; it is a synchronous method.
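
A minimal sketch of what that means in practice: the file can be loaded on the very next line after Save, with no polling or waiting. The temp-folder path and the sample document are assumptions for illustration.

```csharp
using System;
using System.IO;
using System.Xml.Linq;

// Hypothetical output location, used only for this sketch.
var outputPath = Path.Combine(Path.GetTempPath(), "Final.xml");

var doc = new XDocument(new XElement("Model", new XElement("Name", "Demo")));

// Blocks until the file has been written and closed.
doc.Save(outputPath);

// Safe to import right away: Save has already returned.
var reloaded = XDocument.Load(outputPath);
Console.WriteLine(reloaded.Root.Element("Name").Value);
```

If the import still appears to start too early, the delay is likely elsewhere in the workflow rather than in Save itself.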
 

But if you look at case 1 of post #1, File1.xml there has two <ChildGroups>.
 
It must be a mistake by OP, the structure of files posted later doesn't look like that.
 