[mlpack] Is scanning business transactions for fraud an appropriate use of MLpack?

Ryan Curtin ryan at ratml.org
Tue Nov 6 20:46:37 EST 2018


On Tue, Nov 06, 2018 at 02:33:56PM -0500, Rick Hedin wrote:
> Hi.  Could you give me a reading on whether MLpack is an appropriate tool
> for what I want to do?  Too often, you start down a path, and after a few
> weeks you realize "Oh.  I shouldn't be doing this."

Hey there Rick,

Always good to check first. :)  I'll do my best to provide useful
answers...

> I would like to put an AI process on the message stream, transparent
> to other uses of the message stream.  When one of our operators marks
> a transaction as "possibly fraudulent," that would be a data item for
> the AI process.  When they later mark it "definitely fraudulent" or
> "definitely not fraudulent," those are also data items for the AI.
> Eventually, the AI would be able to add additional tags in the record
> "AI suspects this transaction is fraudulent" or "AI suspects this
> transaction is not fraudulent," along with "AI confidence is xxx%."
> 
> The nice thing about this setup is nobody has to spend hours training it.
> The data stream provides both data, and judgement on the data.
> 
> So, is this a good application for MLpack?  Or is it more intended for
> other purposes, and a different software suite is more appropriate?

I think mlpack could work for this, but keep in mind that a lot of the
system development here will be preparing the input to give to mlpack,
so that mlpack can make predictions on it.
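
As a rough illustration of that preparation step, here is a minimal
sketch in plain Python that turns a stream of operator marks into
training labels.  The event format (a transaction id paired with the
mark text) and the name collect_labels are assumptions made up for this
example; your message stream will look different.

```python
# Hypothetical event format: (transaction_id, operator_mark).
FRAUD = "definitely fraudulent"
NOT_FRAUD = "definitely not fraudulent"


def collect_labels(events):
    """Return {transaction_id: 0 or 1}, using only the definite marks."""
    labels = {}
    for txn_id, mark in events:
        if mark == FRAUD:
            labels[txn_id] = 1
        elif mark == NOT_FRAUD:
            labels[txn_id] = 0
        # "possibly fraudulent" is not a final judgement, so it is
        # skipped here; it could instead be kept as an input feature.
    return labels


events = [
    ("t1", "possibly fraudulent"),
    ("t2", "possibly fraudulent"),
    ("t1", "definitely fraudulent"),
    ("t2", "definitely not fraudulent"),
    ("t3", "possibly fraudulent"),
]
labels = collect_labels(events)
```

Transactions that never receive a definite mark (t3 above) get no label
and would be left out of training.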

mlpack makes all of its predictions on numeric data.  So, for instance,
if you have a dataset full of words, you'll need to convert those words
to numeric values first, using a one-hot encoding, or perhaps an
embedding, TF-IDF, or something similar.
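
To make the one-hot idea concrete, here is a minimal sketch in plain
Python.  It does not use mlpack at all; the field values and the helper
names build_vocabulary and one_hot are assumptions for illustration
only.

```python
def build_vocabulary(values):
    """Map each distinct value to a column index, in sorted order."""
    return {value: index for index, value in enumerate(sorted(set(values)))}


def one_hot(value, vocabulary):
    """Return a row with 1.0 in the column for `value`, 0.0 elsewhere."""
    row = [0.0] * len(vocabulary)
    if value in vocabulary:  # values not in the vocabulary become all zeros
        row[vocabulary[value]] = 1.0
    return row


# Hypothetical categorical field taken from four transactions.
payment_types = ["card", "wire", "card", "cheque"]
vocab = build_vocabulary(payment_types)
encoded = [one_hot(p, vocab) for p in payment_types]
```

Each distinct value becomes its own numeric column, so the result is a
purely numeric matrix of the kind a numeric learner expects.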

Note that mlpack does have Python bindings, so if you're working from
Python it might fit really nicely into a Python workflow.

Hope that this is helpful!

Thanks,

Ryan

-- 
Ryan Curtin    | "Avoid the planet Earth at all costs."
ryan at ratml.org |   - The President

