diff --git a/.github/workflows/python-publish.yml b/.github/workflows/python-publish.yml
new file mode 100644
index 0000000..4d213f9
--- /dev/null
+++ b/.github/workflows/python-publish.yml
@@ -0,0 +1,29 @@
+name: Upload Python Package
+
+on:
+  release:
+    types: [published]
+
+permissions:
+  contents: read
+
+jobs:
+  deploy:
+
+    runs-on: ubuntu-latest
+
+    steps:
+    - uses: actions/checkout@v3
+    - name: Set up Python
+      uses: actions/setup-python@v3
+      with:
+        python-version: '3.x'
+    - name: Install dependencies
+      run: |
+        python -m pip install --upgrade pip
+        pip install hatch hatchling
+    - name: Build package
+      run: hatch build
+    - name: Publish package
+      run: |
+        hatch publish -u "__token__" -a ${{ secrets.PYPI_API_TOKEN }}
\ No newline at end of file
diff --git a/.gitignore b/.gitignore
index 075c8ac..b5c8113 100644
--- a/.gitignore
+++ b/.gitignore
@@ -14,7 +14,7 @@ csproj/obj/*
.Python
build/
develop-eggs/
-dist/
+src/python_redlines/data/
downloads/
eggs/
.eggs/
diff --git a/.idea/.gitignore b/.idea/.gitignore
new file mode 100644
index 0000000..13566b8
--- /dev/null
+++ b/.idea/.gitignore
@@ -0,0 +1,8 @@
+# Default ignored files
+/shelf/
+/workspace.xml
+# Editor-based HTTP Client requests
+/httpRequests/
+# Datasource local storage ignored files
+/dataSources/
+/dataSources.local.xml
diff --git a/.idea/inspectionProfiles/profiles_settings.xml b/.idea/inspectionProfiles/profiles_settings.xml
new file mode 100644
index 0000000..105ce2d
--- /dev/null
+++ b/.idea/inspectionProfiles/profiles_settings.xml
@@ -0,0 +1,6 @@
+
+
+
+
+
+
\ No newline at end of file
diff --git a/.idea/misc.xml b/.idea/misc.xml
new file mode 100644
index 0000000..f509063
--- /dev/null
+++ b/.idea/misc.xml
@@ -0,0 +1,7 @@
+
+
+
+
+
+
+
\ No newline at end of file
diff --git a/.idea/modules.xml b/.idea/modules.xml
new file mode 100644
index 0000000..fa9bd94
--- /dev/null
+++ b/.idea/modules.xml
@@ -0,0 +1,8 @@
+
+
+
+
+
+
+
+
\ No newline at end of file
diff --git a/.idea/python-redlines.iml b/.idea/python-redlines.iml
new file mode 100644
index 0000000..bf69088
--- /dev/null
+++ b/.idea/python-redlines.iml
@@ -0,0 +1,10 @@
+
+
+
+
+
+
+
+
+
+
\ No newline at end of file
diff --git a/.idea/vcs.xml b/.idea/vcs.xml
new file mode 100644
index 0000000..94a25f7
--- /dev/null
+++ b/.idea/vcs.xml
@@ -0,0 +1,6 @@
+
+
+
+
+
+
\ No newline at end of file
diff --git a/LICENSE.md b/LICENSE.md
new file mode 100644
index 0000000..fc0f916
--- /dev/null
+++ b/LICENSE.md
@@ -0,0 +1,9 @@
+MIT License
+
+Copyright (c) 2024-present U.N. Owen
+
+Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
+
+The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
+
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
diff --git a/README.md b/README.md
new file mode 100644
index 0000000..712b974
--- /dev/null
+++ b/README.md
@@ -0,0 +1,167 @@
+# Python-Redlines: Docx Redlines (Tracked Changes) for the Python Ecosystem
+
+## Project Goal - Democratizing DOCX Comparisons
+
+The main goal of this project is to address the significant gap in the open-source ecosystem around `.docx` document
+comparison tools. Currently, the process of comparing and generating redline documents (documents that highlight
+changes between versions) is complex and largely dominated by commercial software. These
+tools, while effective, often come with cost barriers and limitations in terms of accessibility and integration
+flexibility.
+
+`Python-redlines` aims to democratize the ability to run tracked change redlines for .docx, providing the
+open-source community with a tool to create `.docx` redlines without the need for commercial software. This will let
+more legal hackers and hobbyist innovators experiment and create tooling for enterprise and legal.
+
+## Project Roadmap
+
+### Step 1. Open-XML-PowerTools `WmlComparer` Wrapper
+
+The [Open-XML-PowerTools](https://github.com/OpenXmlDev/Open-Xml-PowerTools) project historically offered a solid
+foundation for working with `.docx` files and has an excellent (if imperfect) comparison engine in its `WmlComparer`
+class. However, Microsoft archived the repository almost five years ago, and the forked repo is not being actively
+maintained: its most recent commit dates from two years ago and its issues list is disabled.
+
+As a first step, our project aims to bring the existing capabilities of `WmlComparer` into the Python world. Thankfully,
+Open-XML-PowerTools is fully cross-platform, as it is written in .NET and compiles with the still-maintained .NET 8. The
+resulting binaries can be compiled for the latest versions of Windows, macOS and Linux (Ubuntu specifically, though other
+distributions should work fine too). We have included a macOS build but do not have a macOS machine to test on. Please
+report any issues by opening a new Issue.
+
+The initial release has a single engine `XmlPowerToolsEngine`, which is just a Python wrapper for a simple C# utility
+written to leverage WmlComparer for 1-to-1 redlines. We hope this provides a stop-gap capability to Python developers
+seeking .docx redline capabilities.
+
+**Note**: we don't plan to fork or maintain Open-XML-PowerTools. [Version 4.4.0](https://www.nuget.org/packages/Open-Xml-PowerTools/),
+which appears to be compatible only with [Open XML SDK < 3.0.0](https://www.nuget.org/packages/DocumentFormat.OpenXml), works
+for now, but it needs to be made compatible with the latest versions of the Open XML SDK to extend its life. **There are
+also some [issues](https://github.com/dotnet/Open-XML-SDK/issues/1634)** that the sole maintainer of
+Open-XML-PowerTools seems unlikely to fix, and understanding the existing code base is no small task. Please be aware that
+**Open-XML-PowerTools is not a perfect comparison engine, but it will work for many purposes. Use at your own risk.**
+
+### Step 2. Pure Python Comparison Engine
+
+Looking towards the future, rather than reverse engineer `WmlComparer` and maintain a C# codebase, we envision a
+comparison engine written in Python. We've done some experimentation with [`xmldiff`](https://github.com/Shoobx/xmldiff)
+as the engine to compare the underlying XML of `.docx` files. Specifically, we've built a prototype that unzips `.docx` files,
+executes an XML comparison using `xmldiff`, and then reconstructs a tracked-changes `.docx` with the proper Open XML
+(OOXML) tracked change tags. Preliminary experimentation with this approach has shown promise, indicating its
+feasibility for handling modifications such as simple span inserts and deletes.
+
+However, this ambitious endeavor is not without its challenges. The intricacies of `.docx` files and the potential for
+complex, corner-case scenarios necessitate a thoughtful and thorough development process. In the interim, `WmlComparer`
+is a great solution, as it has clearly been built to account for many such corner cases through a development process
+shaped by issues discovered by a large user base. The `xmldiff`-based engine will take some time to reach
+a level of maturity similar to `WmlComparer`. At the moment it is NOT included.
+
+## Getting started
+
+### Install .NET 8
+
+The Open-XML-PowerTools engine we're using in the initial releases requires .NET to run (don't worry, this is very
+well-supported cross-platform at the moment). Our builds are targeting x86-64 Linux and Windows, however, so you'll
+need to modify the build script and build new binaries if you want to target another runtime / architecture.
+
+#### On Linux
+
+You can follow [Microsoft's instructions for your Linux distribution](https://learn.microsoft.com/en-us/dotnet/core/install/linux).
+
+#### On Windows
+
+You can follow [Microsoft's instructions for your Windows version](https://learn.microsoft.com/en-us/dotnet/core/install/windows?tabs=net80).
+
+### Install the Library
+
+At the moment, we are not distributing via pypi. You can easily install directly from this repo, however.
+
+```commandline
+pip install git+https://github.com/JSv4/Python-Redlines
+```
+
+You can add this as a dependency like so:
+
+```requirements
+python_redlines @ git+https://github.com/JSv4/Python-Redlines@v0.0.1
+```
+
+### Use the Library
+
+If you just want to use the tool, jump into our [quickstart guide](docs/quickstart.md).
+
+## Architecture Overview
+
+`XmlPowerToolsEngine` is a Python wrapper class for the `redlines` C# command-line tool, the source of which is available in
+[./csproj/Program.cs](./csproj/Program.cs). The redlines utility and wrapper let you compare two `.docx` files and
+show the differences as tracked changes (a "redline" document).
+
+### C# Functionality
+
+The `redlines` C# utility is a command line tool that requires four arguments:
+1. `author_tag` - A tag to identify the author of the changes.
+2. `original_path.docx` - Path to the original document.
+3. `modified_path.docx` - Path to the modified document.
+4. `redline_path.docx` - Path where the redlined document will be saved.
+
+The Python wrapper, `XmlPowerToolsEngine` and its main method `run_redline()`, simplifies the use of `redlines` by
+orchestrating its execution with Python and letting you pass in bytes or file paths for the original and modified
+documents.
+
+### Packaging
+
+The project is structured as follows:
+```
+python-redlines/
+│
+├── csproj/
+│   ├── bin/
+│   ├── obj/
+│   ├── Program.cs
+│   ├── redlines.csproj
+│   └── redlines.sln
+│
+├── docs/
+│   ├── developer-guide.md
+│   └── quickstart.md
+│
+├── src/
+│   └── python_redlines/
+│       ├── bin/
+│       │   └── .gitignore
+│       ├── dist/
+│       │   ├── .gitignore
+│       │   ├── linux-x64-0.0.1.tar.gz
+│       │   └── win-x64-0.0.1.zip
+│       ├── __about__.py
+│       ├── __init__.py
+│       └── engines.py
+│
+├── tests/
+│   ├── fixtures/
+│   ├── test_openxml_differ.py
+│   └── __init__.py
+│
+├── .gitignore
+├── build_differ.py
+├── extract_version.py
+├── LICENSE.md
+├── pyproject.toml
+└── README.md
+```
+
+- `src/python_redlines/`: Contains the Python wrapper code.
+- `dist/`: Contains the zipped C# binaries for different platforms.
+- `bin/`: Target directory for extracted binaries.
+- `tests/`: Contains test cases and fixtures for the wrapper.
+
+### Detailed Explanation and Dev Setup
+
+If you want to contribute to the library or want to dive into some of the C# packaging architecture, go to our
+[developer guide](docs/developer-guide.md).
+
+## Additional Information
+
+- **Contributing**: Contributions to the project should follow the established coding and documentation standards.
+- **Issues and Support**: For issues, feature requests, or support, please use the project's issue tracker on GitHub.
+
+## License
+
+MIT
diff --git a/build_differ.py b/build_differ.py
new file mode 100644
index 0000000..0c6ab53
--- /dev/null
+++ b/build_differ.py
@@ -0,0 +1,109 @@
+import subprocess
+import os
+import tarfile
+import zipfile
+
+
+def get_version():
+    """
+    Extracts the version from the specified __about__.py file.
+    """
+    about = {}
+    with open('./src/python_redlines/__about__.py') as f:
+        exec(f.read(), about)
+    return about['__version__']
+
+
+def run_command(command):
+    """
+    Runs a shell command and prints its output.
+    """
+    process = subprocess.Popen(command, shell=True, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
+    for line in process.stdout:
+        print(line.decode().strip())
+
+
+def compress_files(source_dir, target_file):
+    """
+    Compresses files in the specified directory into a tar.gz or zip file.
+    """
+    if target_file.endswith('.tar.gz'):
+        with tarfile.open(target_file, "w:gz") as tar:
+            tar.add(source_dir, arcname=os.path.basename(source_dir))
+    elif target_file.endswith('.zip'):
+        with zipfile.ZipFile(target_file, 'w', zipfile.ZIP_DEFLATED) as zipf:
+            for root, dirs, files in os.walk(source_dir):
+                for file in files:
+                    zipf.write(os.path.join(root, file),
+                               os.path.relpath(os.path.join(root, file),
+                                               os.path.join(source_dir, '..')))
+
+
+def cleanup_old_builds(dist_dir, current_version):
+    """
+    Deletes any build files ending in .zip or .tar.gz in the dist_dir with a different version tag.
+    """
+    for file in os.listdir(dist_dir):
+        if not file.endswith((f'{current_version}.zip', f'{current_version}.tar.gz', '.gitignore')):
+            file_path = os.path.join(dist_dir, file)
+            os.remove(file_path)
+            print(f"Deleted old build file: {file}")
+
+
+def main():
+    version = get_version()
+    print(f"Version: {version}")
+
+    dist_dir = "./src/python_redlines/dist/"
+
+    # Build for Linux x64
+    print("Building for Linux x64...")
+    run_command('dotnet publish ./csproj -c Release -r linux-x64 --self-contained')
+
+    # Build for Linux ARM64
+    print("Building for Linux ARM64...")
+    run_command('dotnet publish ./csproj -c Release -r linux-arm64 --self-contained')
+
+    # Build for Windows x64
+    print("Building for Windows x64...")
+    run_command('dotnet publish ./csproj -c Release -r win-x64 --self-contained')
+
+    # Build for Windows ARM64
+    print("Building for Windows ARM64...")
+    run_command('dotnet publish ./csproj -c Release -r win-arm64 --self-contained')
+
+    # Build for macOS x64
+    print("Building for macOS x64...")
+    run_command('dotnet publish ./csproj -c Release -r osx-x64 --self-contained')
+
+    # Build for macOS ARM64
+    print("Building for macOS ARM64...")
+    run_command('dotnet publish ./csproj -c Release -r osx-arm64 --self-contained')
+
+    # Compress the Linux x64 build
+    linux_x64_build_dir = './csproj/bin/Release/net8.0/linux-x64'
+    compress_files(linux_x64_build_dir, f"{dist_dir}/linux-x64-{version}.tar.gz")
+
+    # Compress the Linux ARM64 build
+    linux_arm64_build_dir = './csproj/bin/Release/net8.0/linux-arm64'
+    compress_files(linux_arm64_build_dir, f"{dist_dir}/linux-arm64-{version}.tar.gz")
+
+    # Compress the Windows x64 build
+    windows_build_dir = './csproj/bin/Release/net8.0/win-x64'
+    compress_files(windows_build_dir, f"{dist_dir}/win-x64-{version}.zip")
+
+    # Compress the macOS x64 build
+    macos_x64_build_dir = './csproj/bin/Release/net8.0/osx-x64'
+    compress_files(macos_x64_build_dir, f"{dist_dir}/osx-x64-{version}.tar.gz")
+
+    # Compress the macOS ARM64 build
+    macos_arm64_build_dir = './csproj/bin/Release/net8.0/osx-arm64'
+    compress_files(macos_arm64_build_dir, f"{dist_dir}/osx-arm64-{version}.tar.gz")
+
+    cleanup_old_builds(dist_dir, version)
+
+    print("Build and compression complete.")
+
+
+if __name__ == "__main__":
+    main()
diff --git a/csproj/Program.cs b/csproj/Program.cs
new file mode 100644
index 0000000..9a73417
--- /dev/null
+++ b/csproj/Program.cs
@@ -0,0 +1,57 @@
+using System;
+using System.IO;
+using Clippit;
+using Clippit.Word;
+using DocumentFormat.OpenXml.Packaging;
+
+class Program
+{
+    static void Main(string[] args)
+    {
+        if (args.Length != 4)
+        {
+            Console.WriteLine("Usage: redlines <author_tag> <original_path.docx> <modified_path.docx> <redline_path.docx>");
+            return;
+        }
+
+        string authorTag = args[0];
+        string originalFilePath = args[1];
+        string modifiedFilePath = args[2];
+        string outputFilePath = args[3];
+
+        if (!File.Exists(originalFilePath) || !File.Exists(modifiedFilePath))
+        {
+            Console.WriteLine("Error: One or both files do not exist.");
+            return;
+        }
+
+        try
+        {
+            var originalBytes = File.ReadAllBytes(originalFilePath);
+            var modifiedBytes = File.ReadAllBytes(modifiedFilePath);
+            var originalDocument = new WmlDocument(originalFilePath, originalBytes);
+            var modifiedDocument = new WmlDocument(modifiedFilePath, modifiedBytes);
+
+            var comparisonSettings = new WmlComparerSettings
+            {
+                AuthorForRevisions = authorTag,
+                DetailThreshold = 0
+            };
+
+            var comparisonResults = WmlComparer.Compare(originalDocument, modifiedDocument, comparisonSettings);
+            var revisions = WmlComparer.GetRevisions(comparisonResults, comparisonSettings);
+
+            // Output results
+            Console.WriteLine($"Revisions found: {revisions.Count}");
+
+            File.WriteAllBytes(outputFilePath, comparisonResults.DocumentByteArray);
+        }
+        catch (Exception ex)
+        {
+            Console.WriteLine($"Error: {ex.Message}");
+            Console.WriteLine("Detailed Stack Trace:");
+            Console.WriteLine(ex.StackTrace);
+        }
+    }
+}
+
diff --git a/csproj/bin/.gitignore b/csproj/bin/.gitignore
deleted file mode 100644
index f59ec20..0000000
--- a/csproj/bin/.gitignore
+++ /dev/null
@@ -1 +0,0 @@
-*
\ No newline at end of file
diff --git a/csproj/redlines.csproj b/csproj/redlines.csproj
new file mode 100644
index 0000000..3ea6f1d
--- /dev/null
+++ b/csproj/redlines.csproj
@@ -0,0 +1,14 @@
+
+
+
+ Exe
+ net8.0
+ enable
+ enable
+
+
+
+
+
+
+
diff --git a/csproj/redlines.sln b/csproj/redlines.sln
new file mode 100644
index 0000000..c3d3f0a
--- /dev/null
+++ b/csproj/redlines.sln
@@ -0,0 +1,25 @@
+
+Microsoft Visual Studio Solution File, Format Version 12.00
+# Visual Studio Version 17
+VisualStudioVersion = 17.5.002.0
+MinimumVisualStudioVersion = 10.0.40219.1
+Project("{9A19103F-16F7-4668-BE54-9A1E7A4F7556}") = "redlines", "redlines.csproj", "{ABB1058B-B929-49BE-BFA7-3D93D8C7BFEA}"
+EndProject
+Global
+ GlobalSection(SolutionConfigurationPlatforms) = preSolution
+ Debug|Any CPU = Debug|Any CPU
+ Release|Any CPU = Release|Any CPU
+ EndGlobalSection
+ GlobalSection(ProjectConfigurationPlatforms) = postSolution
+ {ABB1058B-B929-49BE-BFA7-3D93D8C7BFEA}.Debug|Any CPU.ActiveCfg = Debug|Any CPU
+ {ABB1058B-B929-49BE-BFA7-3D93D8C7BFEA}.Debug|Any CPU.Build.0 = Debug|Any CPU
+ {ABB1058B-B929-49BE-BFA7-3D93D8C7BFEA}.Release|Any CPU.ActiveCfg = Release|Any CPU
+ {ABB1058B-B929-49BE-BFA7-3D93D8C7BFEA}.Release|Any CPU.Build.0 = Release|Any CPU
+ EndGlobalSection
+ GlobalSection(SolutionProperties) = preSolution
+ HideSolutionNode = FALSE
+ EndGlobalSection
+ GlobalSection(ExtensibilityGlobals) = postSolution
+ SolutionGuid = {EE9F0B5D-33E8-4477-BA5D-1C8F3EAA5CD8}
+ EndGlobalSection
+EndGlobal
diff --git a/dist/.gitignore b/dist/.gitignore
new file mode 100644
index 0000000..c96a04f
--- /dev/null
+++ b/dist/.gitignore
@@ -0,0 +1,2 @@
+*
+!.gitignore
\ No newline at end of file
diff --git a/docs/developer-guide.md b/docs/developer-guide.md
new file mode 100644
index 0000000..3dee299
--- /dev/null
+++ b/docs/developer-guide.md
@@ -0,0 +1,103 @@
+# Developer Guide for RedlinesWrapper
+
+## Prerequisites
+
+- Python 3.8 or higher installed
+- .NET SDK for building C# binaries or .NET Runtime to run them
+- Hatch for Python environment and package management
+
+## Setting Up the Project
+
+### Step 1: Clone the Repository
+
+Clone the Python-Docx-Redlines repository to your local machine.
+
+Use Git to clone the repository with the following commands:
+
+```bash
+git clone https://github.com/JSv4/Python-Docx-Redlines
+cd Python-Docx-Redlines
+```
+
+### Step 2: Install Hatch
+
+If Hatch is not already installed, install it using pip:
+
+```bash
+pip install hatch hatchling
+```
+
+### Step 3: Create and Activate the Virtual Environment
+
+Inside the project directory, create a virtual environment using Hatch:
+
+```bash
+hatch env create
+```
+
+Activate the virtual environment:
+
+```bash
+hatch shell
+```
+
+### Step 4: Install Dependencies
+
+Install the necessary Python dependencies:
+
+```bash
+pip install .[dev]
+```
+
+## Building the C# Binaries
+
+You can use the binaries distributed with the project, or, if you want to build new binaries for some reason, you can
+use our build script, integrated as a hatch tool.
+
+```bash
+hatch run build
+```
+
+### Under the Hood
+
+We're just using dotnet to build binaries for [Program.cs](csproj/Program.cs), a command line utility that exposes
+`WmlComparer`'s redlining capabilities. We currently target win-x64 and linux-x64 builds, but any runtime
+[supported by .NET](https://learn.microsoft.com/en-us/dotnet/core/rid-catalog) is theoretically supported.
+
+**Our build script does the following:**
+
+1. Build a binary for Linux:
+
+```bash
+dotnet publish -c Release -r linux-x64 --self-contained
+```
+
+2. Build a binary for Windows:
+
+```bash
+dotnet publish -c Release -r win-x64 --self-contained
+```
+
+3. Build a binary for MacOS:
+
+```bash
+dotnet publish -c Release -r osx-x64 --self-contained
+```
+
+4. Archive the binaries for each platform into `./src/python_redlines/dist/`.
+
+
+## Running Tests
+
+To ensure everything is set up correctly and working as expected, run the tests included in the `tests/` directory.
+Execute the tests using pytest:
+
+```bash
+pytest
+```
+
+This will run all the test cases defined in your test files.
+
+## Conclusion
+
+You've now set up the Python-Docx-Redlines project, built the necessary C# binaries, and learned how to use the Python wrapper to compare `.docx` files. Running the tests ensures that your setup is correct and the wrapper functions as expected.
diff --git a/docs/quickstart.md b/docs/quickstart.md
new file mode 100644
index 0000000..2c74b80
--- /dev/null
+++ b/docs/quickstart.md
@@ -0,0 +1,44 @@
+# Python-Redlines Quickstart
+
+As discussed in the main README, the initial version is a wrapper for the C# API provided by Open-XML-PowerTools and
+`WmlComparer`. This guide shows you how to use the `XmlPowerToolsEngine` to run a redline.
+
+### Step 1: Import and Initialize the Wrapper
+
+In your Python script or interactive session, import and initialize the wrapper:
+
+```python
+from python_redlines.engines import XmlPowerToolsEngine
+
+wrapper = XmlPowerToolsEngine()
+```
+
+### Step 2: Run Redlines
+
+Use the `run_redline` method to compare documents. You can pass the paths of the `.docx` files or their byte content:
+
+```python
+# Example with file paths
+output = wrapper.run_redline('AuthorTag', '/path/to/original.docx', '/path/to/modified.docx')
+
+# Example with byte content
+with open('/path/to/original.docx', 'rb') as f:
+    original_bytes = f.read()
+with open('/path/to/modified.docx', 'rb') as f:
+    modified_bytes = f.read()
+
+# run_redline returns a tuple: (redline bytes, stdout, stderr); the bytes are element 0
+output = wrapper.run_redline('AuthorTag', original_bytes, modified_bytes)
+```
+
+In both cases, `output` is a tuple whose first element is the byte content of the resulting redline: a `.docx` with
+the differences shown as tracked changes.
+
+### Step 3: Handle the Output
+
+Process or save the output as needed. For example, to save the redline output to a file:
+
+```python
+with open('/path/to/redline_output.docx', 'wb') as f:
+    f.write(output[0])
+```
diff --git a/extract_version.py b/extract_version.py
new file mode 100644
index 0000000..71226d3
--- /dev/null
+++ b/extract_version.py
@@ -0,0 +1,12 @@
+# extract_version.py
+def get_version():
+    """
+    Extracts the version from the specified __about__.py file.
+    """
+    about = {}
+    with open('./src/python_redlines/__about__.py') as f:
+        exec(f.read(), about)
+    return about['__version__']
+
+if __name__ == "__main__":
+    print(get_version())
diff --git a/hatch_run_build_hook.py b/hatch_run_build_hook.py
new file mode 100644
index 0000000..f20b2b1
--- /dev/null
+++ b/hatch_run_build_hook.py
@@ -0,0 +1,9 @@
+import subprocess
+from hatchling.builders.hooks.plugin.interface import BuildHookInterface
+
+class HatchRunBuildHook(BuildHookInterface):
+    PLUGIN_NAME = 'hatch-run-build'
+
+    def initialize(self, version, build_data):
+        # Run the build_differ script (the same command that 'hatch run build' invokes)
+        subprocess.run(["python", "-m", "build_differ"], check=True)
\ No newline at end of file
diff --git a/pyproject.toml b/pyproject.toml
new file mode 100644
index 0000000..4494c19
--- /dev/null
+++ b/pyproject.toml
@@ -0,0 +1,103 @@
+[build-system]
+requires = ["hatchling"]
+build-backend = "hatchling.build"
+
+[tool.hatch.build.targets.wheel.force-include]
+"dist" = "python_redlines/dist"
+
+[tool.hatch.build.targets.wheel]
+artifacts = [
+ "*.zip",
+ "*.tar.gz",
+]
+[tool.hatch.build.targets.sdist]
+include = [
+ "python_redlines/dist",
+ "python_redlines/bin",
+]
+
+# Build hook to build the binaries for distribution...
+[tool.hatch.build.hooks.custom]
+path = "hatch_run_build_hook.py"
+
+[project]
+name = "python-redlines"
+dynamic = ["version"]
+description = ''
+readme = "README.md"
+requires-python = ">=3.8"
+license = "MIT"
+keywords = []
+authors = [
+ { name = "John Scrudato IV" },
+]
+classifiers = [
+ "Development Status :: 4 - Beta",
+ "Programming Language :: Python",
+ "Programming Language :: Python :: 3.8",
+ "Programming Language :: Python :: 3.9",
+ "Programming Language :: Python :: 3.10",
+ "Programming Language :: Python :: 3.11",
+ "Programming Language :: Python :: 3.12",
+ "Programming Language :: Python :: Implementation :: CPython",
+ "Programming Language :: Python :: Implementation :: PyPy",
+]
+dependencies = [
+ "hatch",
+ "hatchling"
+]
+
+[project.urls]
+Documentation = "https://github.com/JSv4/Python-Redlines#readme"
+Issues = "https://github.com/JSv4/Python-Redlines/issues"
+Source = "https://github.com/JSv4/Python-Redlines"
+
+[tool.hatch.version]
+path = "src/python_redlines/__about__.py"
+
+[tool.hatch.envs.default]
+dependencies = [
+ "coverage[toml]>=6.5",
+ "pytest",
+]
+[tool.hatch.envs.default.scripts]
+test = "pytest {args:tests}"
+test-cov = "coverage run -m pytest {args:tests}"
+cov-report = [
+ "- coverage combine",
+ "coverage report",
+]
+cov = [
+ "test-cov",
+ "cov-report",
+]
+build = "python -m build_differ"
+
+[[tool.hatch.envs.all.matrix]]
+python = ["3.8", "3.9", "3.10", "3.11", "3.12"]
+
+[tool.hatch.envs.types]
+dependencies = [
+ "mypy>=1.0.0",
+]
+[tool.hatch.envs.types.scripts]
+check = "mypy --install-types --non-interactive {args:src/python_redlines tests}"
+
+[tool.coverage.run]
+source_pkgs = ["python_redlines", "tests"]
+branch = true
+parallel = true
+omit = [
+ "src/python_redlines/__about__.py",
+]
+
+[tool.coverage.paths]
+python_redlines = ["src/python_redlines", "*/python-redlines/src/python_redlines"]
+tests = ["tests", "*/python-redlines/tests"]
+
+[tool.coverage.report]
+exclude_lines = [
+ "no cov",
+ "if __name__ == .__main__.:",
+ "if TYPE_CHECKING:",
+]
diff --git a/src/python_redlines/__about__.py b/src/python_redlines/__about__.py
new file mode 100644
index 0000000..1d20296
--- /dev/null
+++ b/src/python_redlines/__about__.py
@@ -0,0 +1,4 @@
+# SPDX-FileCopyrightText: 2024-present U.N. Owen
+#
+# SPDX-License-Identifier: MIT
+__version__ = "0.0.5"
diff --git a/src/python_redlines/__init__.py b/src/python_redlines/__init__.py
new file mode 100644
index 0000000..ec15624
--- /dev/null
+++ b/src/python_redlines/__init__.py
@@ -0,0 +1,3 @@
+# SPDX-FileCopyrightText: 2024-present U.N. Owen
+#
+# SPDX-License-Identifier: MIT
diff --git a/src/python_redlines/bin/.gitignore b/src/python_redlines/bin/.gitignore
new file mode 100644
index 0000000..c96a04f
--- /dev/null
+++ b/src/python_redlines/bin/.gitignore
@@ -0,0 +1,2 @@
+*
+!.gitignore
\ No newline at end of file
diff --git a/src/python_redlines/dist/.gitignore b/src/python_redlines/dist/.gitignore
new file mode 100644
index 0000000..c96a04f
--- /dev/null
+++ b/src/python_redlines/dist/.gitignore
@@ -0,0 +1,2 @@
+*
+!.gitignore
\ No newline at end of file
diff --git a/src/python_redlines/engines.py b/src/python_redlines/engines.py
new file mode 100644
index 0000000..be80512
--- /dev/null
+++ b/src/python_redlines/engines.py
@@ -0,0 +1,136 @@
+import subprocess
+import tempfile
+import os
+import platform
+import logging
+import zipfile
+import tarfile
+from pathlib import Path
+from typing import Union, Tuple, Optional
+
+from .__about__ import __version__
+
+logger = logging.getLogger(__name__)
+
+
+class XmlPowerToolsEngine(object):
+    def __init__(self, target_path: Optional[str] = None):
+        self.target_path = target_path
+        self.extracted_binaries_path = self.__unzip_binary()
+
+    def __unzip_binary(self):
+        """
+        Unzips the appropriate C# binary for the current platform.
+        """
+        base_path = os.path.dirname(__file__)
+        binaries_path = os.path.join(base_path, 'dist')
+        target_path = self.target_path if self.target_path else os.path.join(base_path, 'bin')
+
+        if not os.path.exists(target_path):
+            os.makedirs(target_path)
+
+        # Get the binary name and zip name based on the OS and architecture
+        binary_name, zip_name = self.__get_binaries_info()
+
+        # Check if the binary already exists. If not, extract it.
+        full_binary_path = os.path.join(target_path, binary_name)
+
+        if not os.path.exists(full_binary_path):
+            zip_path = os.path.join(binaries_path, zip_name)
+            self.__extract_binary(zip_path, target_path)
+
+        return os.path.join(target_path, binary_name)
+
+    def __extract_binary(self, zip_path: str, target_path: str):
+        """
+        Extracts the binary from the zip file based on the extension. Supports .zip and .tar.gz files
+        :parameter
+            zip_path: str - The path to the zip file
+            target_path: str - The path to extract the binary to
+        """
+        if zip_path.endswith('.zip'):
+            with zipfile.ZipFile(zip_path, 'r') as zip_ref:
+                zip_ref.extractall(target_path)
+
+        elif zip_path.endswith('.tar.gz'):
+            with tarfile.open(zip_path, 'r:gz') as tar_ref:
+                tar_ref.extractall(target_path)
+
+    def __get_binaries_info(self):
+        """
+        Returns the binary name and zip name based on the OS and architecture
+        :return
+            binary_name: str - The name of the binary file
+            zip_name: str - The name of the zip file
+        """
+        os_name = platform.system().lower()
+        arch = platform.machine().lower()
+
+        if arch in ('x86_64', 'amd64'):
+            arch = 'x64'
+        elif arch in ('arm64', 'aarch64'):
+            arch = 'arm64'
+        else:
+            raise EnvironmentError(f"Unsupported architecture: {arch}")
+
+        if os_name == 'linux':
+            zip_name = f"linux-{arch}-{__version__}.tar.gz"
+            binary_name = f'linux-{arch}/redlines'
+
+        elif os_name == 'windows':
+            zip_name = f"win-{arch}-{__version__}.zip"
+            binary_name = f'win-{arch}/redlines.exe'
+
+        elif os_name == 'darwin':
+            zip_name = f"osx-{arch}-{__version__}.tar.gz"
+            binary_name = f'osx-{arch}/redlines'
+
+        else:
+            raise EnvironmentError("Unsupported OS")
+
+        return binary_name, zip_name
+
+    def run_redline(self, author_tag: str, original: Union[bytes, str, Path], modified: Union[bytes, str, Path]) \
+            -> Tuple[bytes, Optional[str], Optional[str]]:
+        """
+        Runs the redlines binary. The 'original' and 'modified' arguments can be either bytes or file paths.
+        Returns a tuple of (redline output as bytes, stdout or None, stderr or None).
+        """
+        temp_files = []
+        try:
+            # Track only the temp files created here, so caller-supplied paths are never deleted during cleanup.
+            target_path = tempfile.NamedTemporaryFile(delete=False).name
+            original_path = self._write_to_temp_file(original) if isinstance(original, bytes) else original
+            modified_path = self._write_to_temp_file(modified) if isinstance(modified, bytes) else modified
+            temp_files += [target_path] + [p for p, src in ((original_path, original), (modified_path, modified)) if isinstance(src, bytes)]
+
+            command = [self.extracted_binaries_path, author_tag, original_path, modified_path, target_path]
+
+            # Capture stdout and stderr
+            result = subprocess.run(command, check=True, stdout=subprocess.PIPE, stderr=subprocess.PIPE, text=True)
+
+            stdout_output = result.stdout if isinstance(result.stdout, str) and len(result.stdout) > 0 else None
+            stderr_output = result.stderr if isinstance(result.stderr, str) and len(result.stderr) > 0 else None
+
+            redline_output = Path(target_path).read_bytes()
+
+            return redline_output, stdout_output, stderr_output
+
+        finally:
+            self._cleanup_temp_files(temp_files)
+
+    def _cleanup_temp_files(self, temp_files):
+        for file_path in temp_files:
+            try:
+                os.remove(file_path)
+            except OSError as e:
+                print(f"Error deleting temp file {file_path}: {e}")
+
+    def _write_to_temp_file(self, data):
+        """
+        Writes bytes data to a temporary file and returns the file path.
+        """
+        temp_file = tempfile.NamedTemporaryFile(delete=False)
+        temp_file.write(data)
+        temp_file.close()
+        return temp_file.name
diff --git a/tests/__init__.py b/tests/__init__.py
new file mode 100644
index 0000000..ec15624
--- /dev/null
+++ b/tests/__init__.py
@@ -0,0 +1,3 @@
+# SPDX-FileCopyrightText: 2024-present U.N. Owen
+#
+# SPDX-License-Identifier: MIT
diff --git a/tests/fixtures/expected_redline.docx b/tests/fixtures/expected_redline.docx
new file mode 100644
index 0000000..be5b51c
Binary files /dev/null and b/tests/fixtures/expected_redline.docx differ
diff --git a/tests/fixtures/modified.docx b/tests/fixtures/modified.docx
new file mode 100644
index 0000000..6b1478c
Binary files /dev/null and b/tests/fixtures/modified.docx differ
diff --git a/tests/fixtures/original.docx b/tests/fixtures/original.docx
new file mode 100644
index 0000000..0d074b4
Binary files /dev/null and b/tests/fixtures/original.docx differ
diff --git a/tests/test_openxml_differ.py b/tests/test_openxml_differ.py
new file mode 100644
index 0000000..2305bb2
--- /dev/null
+++ b/tests/test_openxml_differ.py
@@ -0,0 +1,40 @@
+import os
+import pytest
+from unittest.mock import patch, MagicMock
+
+from python_redlines.engines import XmlPowerToolsEngine
+
+
+def load_docx_bytes(file_path):
+    # Handle relative paths from test directory
+    if not os.path.isabs(file_path):
+        file_path = os.path.join(os.path.dirname(__file__), file_path)
+    with open(file_path, 'rb') as file:
+        return file.read()
+
+
+@pytest.fixture
+def original_docx():
+    return load_docx_bytes('fixtures/original.docx')
+
+
+@pytest.fixture
+def modified_docx():
+    return load_docx_bytes('fixtures/modified.docx')
+
+
+def test_run_redlines_with_real_files(original_docx, modified_docx):
+    # Create an instance of the wrapper
+    wrapper = XmlPowerToolsEngine()
+
+    author_tag = "TestAuthor"
+
+    # Running the wrapper function with actual file bytes
+    redline_output, stdout, stderr = wrapper.run_redline(author_tag, original_docx, modified_docx)
+
+    # Asserting that some output is generated (specific assertions depend on expected output)
+    assert redline_output is not None
+    assert isinstance(redline_output, bytes)
+    assert len(redline_output) > 0
+    assert stderr is None
+    assert "Revisions found: 9" in stdout