Commit
TensorFlowMapper transform for scoring Tensorflow models in ML.NET (dotnet#704)

* creating dummy file to test permissions. will remove
* test
* TensorFlow scoring, from Zeeshan A.'s branch, with some additional changes.
* creating dummy file to test permissions. will remove
* test
* TensorFlow scoring, from Zeeshan A.'s branch, with some additional changes.
* taking care of review comments; build fixes
* simple change intended to trigger fresh builds to repro OSX-Release build failure
* Prevent input tensors from being GC'ed before TF_SessionRun is called
* Remove Tensorflow models from tests data. Instead bring these from a nuget package created in another repo. For now this is https://github.com/ericstj/machinelearning-testdata-temp. Soon it will be https://github.com/dotnet/machinelearning-testdata
* Add entry point
* Fix manifest and generated C# API.
* Create a redist package for tensorflow. Create a nuget package that redistributes the TensorFlow C-API. This is needed because TensorFlow doesn't ship an official NuGet package. This is a straight up repack of the bits published on tensorflow.org. I made sure to apply the TensorFlow license to this package and not sign it with our Authenticode certificates.
* Make TF redist project a normal MSBuild project. Remove the use of the SDK targets, and define our own build and clean.
* Add TF License file to package
* Don't use fullpaths when un-tar'ing. Tar on windows was failing when msbuild passed it a full path. Work around this by using relative paths and running where we extract to.
* Add some logging to TF redist project
* Fix casing of TF redist proj
* Change tests to use redistributed TensorFlow. Also modify TF binding code to use `tensorflow` in its DLLImports. The runtime will still add the appropriate prefix/extension on linux/mac.
* Fix mac / linux tensorflow redist. I was missing the libtensorflow_framework dependency which caused mac and linux tests to fail. After fixing that, mac still failed due to inability to load libtensorflow_framework.so. We have to rename these to .dylib to satisfy the CORECLR dllimport convention, which broke the internal rpath in libtensorflow which pointed at @rpath/libtensorflow_framework.so. Fix this by rewriting the renamed libtensorflow.dylib on mac. Since this operation can only be done on mac, I had to change the build of the redist project to only build the bits appropriate for the building platform. To make this work correctly in the official build I had to make sure these platform specific builds happen when the native build happens.
* Only include LICENSE if it exists. LICENSE is pulled from the Windows package, so it isn't available on mac or linux.
* unit test to verify TF transform works with ML.NET image transforms
* Factor TensorflowTransform into its own assembly/package
* Update Tensorflow to TensorFlow
* Fix manifest and C# API. Also, add entry point unit test.
* update test case for image transforms; still skipping test for now till we figure out why false positive passes
* Use IDataView instead of var in unit test.
* fix have from Tensorflow to TensorFlow
* Fix unit test to use new TextLoader APIs.
* Add unit test using the pipeline API.
* Validate input dimensions.
* Added XML doc for TensorflowTransform.
* Remove extra dimension for batch size in output.
* Fix input/output validation.
* Introduced BatchSize as constant to replace 1 everywhere.
* enabling unit test of TensorFlowTransform working with ML.NET Image* Transforms
* Extended XML docs with detailed information.
* Ensure we include actual TF license in package. The TF zip/tarballs were missing the actual TF license. Download this and include it in the package. Rename the file from the zip/tarballs as THIRD_PARTY_NOTICES.txt as that represents its content.
* Change input validation to validate all dimensions in case of multi dimensional input.
* Corrected typos in the documentation.
* Fix LICENSE inclusion in package. We didn't define ExtractDirectory on the item and instead had a full path as identity. This worked on linux/osx since it prepended a "/" to the path, which was tolerated by the file system (an extra leading slash). On Windows this doesn't work of course. Fix by appending to the item that doesn't assume files came from an archive.
* Add symbols package and fix package reference to redist
* Address pull request comments.
* Give more details in input dimension mismatch error message.
* Added a test for LearningPipelineAPI and updated the doc.xml
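The commit message says the binding code uses the bare name `tensorflow` in its DllImports and relies on the runtime to add the platform prefix and extension. As a rough illustration of the file names this implies, here is a minimal sketch. The `NativeNames` helper is hypothetical and not part of ML.NET or the runtime; real library probing is done by the runtime itself and tries more variations than shown here.

```csharp
using System;
using System.Runtime.InteropServices;

internal static class NativeNames
{
    // Hypothetical helper: maps a bare library name to the conventional
    // file name on each platform (assumption: lib*.so on anything that is
    // neither Windows nor macOS).
    internal static string ToFileName(string name, OSPlatform platform)
    {
        if (platform == OSPlatform.Windows)
            return name + ".dll";
        if (platform == OSPlatform.OSX)
            return "lib" + name + ".dylib";
        return "lib" + name + ".so";
    }

    private static void Main()
    {
        var platforms = new[] { OSPlatform.Windows, OSPlatform.Linux, OSPlatform.OSX };
        foreach (OSPlatform p in platforms)
            Console.WriteLine($"{p}: {ToFileName("tensorflow", p)}");
    }
}
```

This is why the macOS binaries had to be renamed to `.dylib`, as described in the commit message: a `.so` file name would not match what the sketch above produces for macOS.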
Showing 33 changed files with 4,918 additions and 24 deletions.
21 changes: 21 additions & 0 deletions
pkg/Microsoft.ML.TensorFlow.Redist/Microsoft.ML.TensorFlow.Redist.nupkgproj
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,21 @@ | ||
<Project Sdk="Microsoft.NET.Sdk" DefaultTargets="Pack"> | ||
|
||
<PropertyGroup> | ||
<Authors>The TensorFlow Authors</Authors> | ||
<TargetFramework>netstandard2.0</TargetFramework> | ||
<PackageDescription>$(MSBuildProjectName) contains the TensorFlow C library version $(TensorFlowVersion) redistributed as a NuGet package.</PackageDescription> | ||
<PackageLicenseUrl>https://github.com/tensorflow/tensorflow/blob/master/LICENSE</PackageLicenseUrl> | ||
<PackageRequireLicenseAcceptance>true</PackageRequireLicenseAcceptance> | ||
<Copyright>Copyright 2018 The TensorFlow Authors. All rights reserved.</Copyright> | ||
<PackageProjectUrl>https://www.tensorflow.org</PackageProjectUrl> | ||
<PackageReleaseNotes>https://github.com/tensorflow/tensorflow/releases/tag/v$(TensorFlowVersion)</PackageReleaseNotes> | ||
<PackageTags>$(PackageTags) TensorFlow</PackageTags> | ||
<!-- TODO: consider PackageIconUrl --> | ||
</PropertyGroup> | ||
|
||
<ItemGroup> | ||
<Content Include="..\common\CommonPackage.props" Pack="true" PackagePath="build\netstandard2.0\$(MSBuildProjectName).props" /> | ||
<Content Include="$(PackageAssetsPath)$(PackageIdFolderName)\LICENSE.txt" Pack="true" PackagePath=".\" /> | ||
<Content Include="$(PackageAssetsPath)$(PackageIdFolderName)\THIRD_PARTY_NOTICES.txt" Pack="true" PackagePath=".\" /> | ||
</ItemGroup> | ||
</Project> |
12 changes: 12 additions & 0 deletions
pkg/Microsoft.ML.TensorFlow/Microsoft.ML.TensorFlow.nupkgproj
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,12 @@ | ||
<Project Sdk="Microsoft.NET.Sdk" DefaultTargets="Pack"> | ||
|
||
<PropertyGroup> | ||
<TargetFramework>netstandard2.0</TargetFramework> | ||
<PackageDescription>Microsoft.ML.TensorFlow contains ML.NET integration of TensorFlow.</PackageDescription> | ||
</PropertyGroup> | ||
|
||
<ItemGroup> | ||
<ProjectReference Include="..\Microsoft.ML.TensorFlow.Redist\Microsoft.ML.TensorFlow.Redist.nupkgproj" /> | ||
</ItemGroup> | ||
|
||
</Project> |
5 changes: 5 additions & 0 deletions
pkg/Microsoft.ML.TensorFlow/Microsoft.ML.TensorFlow.symbols.nupkgproj
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,5 @@ | ||
<Project DefaultTargets="Pack"> | ||
|
||
<Import Project="Microsoft.ML.TensorFlow.nupkgproj" /> | ||
|
||
</Project> |
31 changes: 31 additions & 0 deletions
src/Microsoft.ML.TensorFlow/Microsoft.ML.TensorFlow.csproj
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,31 @@ | ||
<Project Sdk="Microsoft.NET.Sdk"> | ||
|
||
<PropertyGroup> | ||
<TargetFramework>netstandard2.0</TargetFramework> | ||
<IncludeInPackage>Microsoft.ML.TensorFlow</IncludeInPackage> | ||
<DefineConstants>CORECLR</DefineConstants> | ||
<AllowUnsafeBlocks>true</AllowUnsafeBlocks> | ||
</PropertyGroup> | ||
|
||
<ItemGroup> | ||
<ProjectReference Include="..\Microsoft.ML.Core\Microsoft.ML.Core.csproj" /> | ||
<ProjectReference Include="..\Microsoft.ML.Data\Microsoft.ML.Data.csproj" /> | ||
</ItemGroup> | ||
|
||
<ItemGroup> | ||
<Compile Update="TensorFlow\TensorGeneric.cs"> | ||
<DesignTime>True</DesignTime> | ||
<AutoGen>True</AutoGen> | ||
<DependentUpon>TensorGeneric.tt</DependentUpon> | ||
</Compile> | ||
<None Update="TensorFlow\TensorGeneric.tt"> | ||
<Generator>TextTemplatingFileGenerator</Generator> | ||
<LastGenOutput>TensorGeneric.cs</LastGenOutput> | ||
</None> | ||
</ItemGroup> | ||
|
||
<ItemGroup> | ||
<Service Include="{508349b6-6b84-4df5-91f0-309beebad82d}" /> | ||
</ItemGroup> | ||
|
||
</Project> |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,211 @@ | ||
// Licensed to the .NET Foundation under one or more agreements. | ||
// The .NET Foundation licenses this file to you under the MIT license. | ||
// See the LICENSE file in the project root for more information. | ||
|
||
using System; | ||
using System.Runtime.InteropServices; | ||
using System.Text; | ||
using size_t = System.UIntPtr; | ||
|
||
#pragma warning disable MSML_GeneralName | ||
#pragma warning disable MSML_ParameterLocalVarName | ||
|
||
namespace Microsoft.ML.Transforms.TensorFlow | ||
{ | ||
/// <summary> | ||
/// This attribute can be applied to callback functions that will be invoked | ||
/// from unmanaged code to managed code. | ||
/// </summary> | ||
/// <remarks> | ||
/// <code> | ||
/// [TensorFlow.MonoPInvokeCallback (typeof (BufferReleaseFunc))] | ||
/// internal static void MyFreeFunc (IntPtr data, IntPtr length){..} | ||
/// </code> | ||
/// </remarks> | ||
internal sealed class MonoPInvokeCallbackAttribute : Attribute | ||
{ | ||
/// <summary> | ||
/// Use this constructor to annotate the type of the callback function that | ||
/// will be invoked from unmanaged code. | ||
/// </summary> | ||
/// <param name="t">The delegate type of the callback function.</param> | ||
public MonoPInvokeCallbackAttribute(Type t) { } | ||
} | ||
|
||
[StructLayout(LayoutKind.Sequential)] | ||
internal struct LLBuffer | ||
{ | ||
internal IntPtr data; | ||
internal size_t length; | ||
internal IntPtr data_deallocator; | ||
} | ||
|
||
/// <summary> | ||
/// Holds a block of data, suitable to pass to, or retrieve from, TensorFlow. | ||
/// </summary> | ||
/// <remarks> | ||
/// <para> | ||
/// Use the TFBuffer to pass blobs of data into TensorFlow, or to retrieve blocks | ||
/// of data out of TensorFlow. | ||
/// </para> | ||
/// <para> | ||
/// There are two constructors to wrap existing data, one to wrap blocks that are | ||
/// pointed to by an IntPtr and one that takes a byte array that we want to wrap. | ||
/// </para> | ||
/// <para> | ||
/// The empty constructor can be used to create a new TFBuffer that can be populated | ||
/// by the TensorFlow library and returned to user code. | ||
/// </para> | ||
/// <para> | ||
/// Typically, the data consists of a serialized protocol buffer, but other data | ||
/// may also be held in a buffer. | ||
/// </para> | ||
/// </remarks> | ||
// TODO: the string ctor | ||
// TODO: perhaps we should have an implicit byte [] conversion that just calls ToArray? | ||
internal class TFBuffer : TFDisposable | ||
{ | ||
// extern TF_Buffer * TF_NewBufferFromString (const void *proto, size_t proto_len); | ||
[DllImport(NativeBinding.TensorFlowLibrary)] | ||
private static extern unsafe LLBuffer* TF_NewBufferFromString(IntPtr proto, IntPtr proto_len); | ||
|
||
// extern TF_Buffer * TF_NewBuffer (); | ||
[DllImport(NativeBinding.TensorFlowLibrary)] | ||
private static extern unsafe LLBuffer* TF_NewBuffer(); | ||
|
||
internal TFBuffer(IntPtr handle) : base(handle) { } | ||
|
||
/// <summary> | ||
/// Initializes a new instance of the <see cref="T:TensorFlow.TFBuffer"/> class. | ||
/// </summary> | ||
public unsafe TFBuffer() : base((IntPtr)TF_NewBuffer()) | ||
{ | ||
} | ||
|
||
/// <summary> | ||
/// Signature of the method that is invoked to release the data. | ||
/// </summary> | ||
/// <remarks> | ||
/// Methods of this signature are invoked with the data pointer and the | ||
/// length when the TFBuffer no longer needs to hold on to the | ||
/// data. If you are using this on platforms with static compilation | ||
/// like iOS, you need to annotate your callback with the MonoPInvokeCallbackAttribute, | ||
/// like this: | ||
/// | ||
/// <code> | ||
/// [TensorFlow.MonoPInvokeCallback (typeof (BufferReleaseFunc))] | ||
/// internal static void MyFreeFunc (IntPtr data, IntPtr length){..} | ||
/// </code> | ||
/// </remarks> | ||
public delegate void BufferReleaseFunc(IntPtr data, IntPtr length); | ||
|
||
/// <summary> | ||
/// Initializes a new instance of the <see cref="T:TensorFlow.TFBuffer"/> by wrapping the unmanaged resource pointed to by the buffer. | ||
/// </summary> | ||
/// <param name="buffer">Pointer to the data that will be wrapped.</param> | ||
/// <param name="size">The size of the buffer to wrap.</param> | ||
/// <param name="release">Optional, if not null, this method will be invoked to release the block.</param> | ||
/// <remarks> | ||
/// This constructor wraps the buffer as the data to be held by the <see cref="T:TensorFlow.TFBuffer"/>. | ||
/// If the release parameter is null, you must ensure that the data is not released while the TFBuffer | ||
/// is still in use. If it is not null, the provided method will be invoked to release | ||
/// the data when the TFBuffer is disposed, or the contents of the buffer replaced. | ||
/// </remarks> | ||
public unsafe TFBuffer(IntPtr buffer, long size, BufferReleaseFunc release) : base((IntPtr)TF_NewBuffer()) | ||
{ | ||
LLBuffer* buf = (LLBuffer*)handle; | ||
buf->data = buffer; | ||
buf->length = (size_t)size; | ||
if (release == null) | ||
buf->data_deallocator = IntPtr.Zero; | ||
else | ||
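// Note: the function pointer does not keep the delegate alive; the caller must hold a reference to `release`. | ||
buf->data_deallocator = Marshal.GetFunctionPointerForDelegate(release); | ||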
} | ||
|
||
[MonoPInvokeCallback(typeof(BufferReleaseFunc))] | ||
internal static void FreeBlock(IntPtr data, IntPtr length) | ||
{ | ||
Marshal.FreeHGlobal(data); | ||
} | ||
|
||
internal static IntPtr FreeBufferFunc; | ||
internal static BufferReleaseFunc FreeBlockDelegate; | ||
|
||
static TFBuffer() | ||
{ | ||
FreeBlockDelegate = FreeBlock; | ||
FreeBufferFunc = Marshal.GetFunctionPointerForDelegate<BufferReleaseFunc>(FreeBlockDelegate); | ||
} | ||
|
||
/// <summary> | ||
/// Initializes a new instance of the <see cref="T:TensorFlow.TFBuffer"/> by making a copy of the provided byte array. | ||
/// </summary> | ||
/// <param name="buffer">Buffer of data that will be copied.</param> | ||
/// <remarks> | ||
/// This constructor makes a copy of the data into an unmanaged buffer, | ||
/// so the byte array is not pinned. | ||
/// </remarks> | ||
public TFBuffer(byte[] buffer) : this(buffer, 0, buffer.Length) { } | ||
|
||
/// <summary> | ||
/// Initializes a new instance of the <see cref="T:TensorFlow.TFBuffer"/> by making a copy of the provided byte array. | ||
/// </summary> | ||
/// <param name="buffer">Buffer of data that will be copied.</param> | ||
/// <param name="start">Starting offset into the buffer to copy from.</param> | ||
/// <param name="count">Number of bytes from the buffer to keep.</param> | ||
/// <remarks> | ||
/// This constructor makes a copy of the data into an unmanaged buffer, | ||
/// so the byte array is not pinned. | ||
/// </remarks> | ||
public TFBuffer(byte[] buffer, int start, int count) : this() | ||
{ | ||
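if (buffer == null) | ||
throw new ArgumentNullException(nameof(buffer)); | ||
// start == buffer.Length is valid when count == 0 (for example, an empty array). | ||
if (start < 0 || start > buffer.Length) | ||
throw new ArgumentOutOfRangeException(nameof(start)); | ||
if (count < 0 || count > buffer.Length - start) | ||
throw new ArgumentOutOfRangeException(nameof(count)); | ||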
unsafe | ||
{ | ||
LLBuffer* buf = LLBuffer; | ||
buf->data = Marshal.AllocHGlobal(count); | ||
Marshal.Copy(buffer, start, buf->data, count); | ||
buf->length = (size_t)count; | ||
buf->data_deallocator = FreeBufferFunc; | ||
} | ||
} | ||
|
||
internal unsafe LLBuffer* LLBuffer => (LLBuffer*)handle; | ||
|
||
// extern void TF_DeleteBuffer (TF_Buffer *); | ||
[DllImport(NativeBinding.TensorFlowLibrary)] | ||
private static extern unsafe void TF_DeleteBuffer(LLBuffer* buffer); | ||
|
||
internal override void NativeDispose(IntPtr handle) | ||
{ | ||
unsafe { TF_DeleteBuffer((LLBuffer*)handle); } | ||
} | ||
|
||
// extern TF_Buffer TF_GetBuffer (TF_Buffer *buffer); | ||
[DllImport(NativeBinding.TensorFlowLibrary)] | ||
private static extern unsafe LLBuffer TF_GetBuffer(LLBuffer* buffer); | ||
|
||
/// <summary> | ||
/// Returns a byte array representing the data wrapped by this buffer. | ||
/// </summary> | ||
/// <returns>The array.</returns> | ||
public byte[] ToArray() | ||
{ | ||
if (handle == IntPtr.Zero) | ||
return null; | ||
|
||
unsafe | ||
{ | ||
var lb = (LLBuffer*)handle; | ||
|
||
var result = new byte[(int)lb->length]; | ||
Marshal.Copy(lb->data, result, 0, (int)lb->length); | ||
|
||
return result; | ||
} | ||
} | ||
} | ||
} |
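The byte-array constructor in the file above copies a slice of a managed array into unmanaged memory and registers a static callback that frees it, so the managed array never needs to be pinned. The pattern can be sketched without the native TensorFlow library. All names below (`UnmanagedCopySketch`, `ReleaseFunc`, `CopyToUnmanaged`, `ReadBack`) are hypothetical and only mirror the shape of the code in the diff; this is an illustration under those assumptions, not the ML.NET implementation.

```csharp
using System;
using System.Runtime.InteropServices;

internal static class UnmanagedCopySketch
{
    // Same shape as BufferReleaseFunc in the diff above.
    internal delegate void ReleaseFunc(IntPtr data, IntPtr length);

    // Kept in a static field so the delegate stays alive for as long as
    // native code might hold the function pointer derived from it.
    private static readonly ReleaseFunc Release = Free;
    internal static readonly IntPtr ReleasePtr =
        Marshal.GetFunctionPointerForDelegate(Release);

    private static void Free(IntPtr data, IntPtr length) => Marshal.FreeHGlobal(data);

    // Copies buffer[start .. start + count) into freshly allocated unmanaged memory.
    internal static IntPtr CopyToUnmanaged(byte[] buffer, int start, int count)
    {
        if (buffer == null)
            throw new ArgumentNullException(nameof(buffer));
        if (start < 0 || start > buffer.Length)
            throw new ArgumentOutOfRangeException(nameof(start));
        if (count < 0 || count > buffer.Length - start)
            throw new ArgumentOutOfRangeException(nameof(count));

        IntPtr data = Marshal.AllocHGlobal(count);
        Marshal.Copy(buffer, start, data, count);
        return data;
    }

    // Copies the unmanaged block back into a managed array, as ToArray does.
    internal static byte[] ReadBack(IntPtr data, int length)
    {
        var result = new byte[length];
        Marshal.Copy(data, result, 0, length);
        return result;
    }

    private static void Main()
    {
        byte[] source = { 1, 2, 3, 4, 5 };
        IntPtr data = CopyToUnmanaged(source, 1, 3);
        try
        {
            Console.WriteLine(string.Join(",", ReadBack(data, 3))); // prints "2,3,4"
        }
        finally
        {
            // In the real binding the native library would call ReleasePtr;
            // here the managed delegate is invoked directly.
            Release(data, (IntPtr)3);
        }
    }
}
```

Holding the delegate in a static field matters because a function pointer obtained from `Marshal.GetFunctionPointerForDelegate` does not prevent the delegate from being garbage collected.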