mlpack IRC logs, 2017-05-12

Logs for the day 2017-05-12 (starts at 0:00 UTC) are shown below.

--- Log opened Fri May 12 00:00:03 2017
00:00 < rcurtin> hello there everyone, I think I will wait another minute or two for a few people that I haven't seen yet
00:00 < zoq> yeah, good idea :)
00:02 < Kirill> hey guys
00:02 -!- govg [~govg@unaffiliated/govg] has joined #mlpack
00:02 < chenzhe> hi!
00:03 < zoq> Hello there!
00:03 < rcurtin> hey there Kirill, glad you could make it :)
00:03 < rcurtin> I thought maybe I had sent the invitation to the wrong email since I didn't see you this morning, but I guess you got it in the end :)
00:03 < Kirill> yeah, I did
00:05 < rcurtin> ok, I guess I will go ahead and start, people can read the logs if they need to catch up :)
00:05 < rcurtin> so, hello everyone, thanks for coming to the meeting! this is the closest we can get to an in-person meeting since we are all so far apart
00:06 < rcurtin> congratulations to the students on your acceptance!
00:06 < rcurtin> as you probably know, this is the second instance of this meeting, so basically I'll be talking about the same things
00:06 < rcurtin> the meeting is logged, and you can find the logs at http://www.mlpack.org/irc/
00:06 < rcurtin> but for some reason the log is not fully working, so I can't seem to load the messages from earlier today... I will have to look into that...
00:07 < rcurtin> anyway, I think you all know me, I'm Ryan Curtin, the mlpack GSoC organization administrator
00:07 < chenzhe> It seems to be quite a long log to read ^_^
00:07 < rcurtin> yeah, many messages :)
00:07 < rcurtin> this is the biggest GSoC mlpack has had; this year we have 10 students. In previous years we've had 6, 5, and 3... so this is a much bigger logistical challenge!
00:08 < rcurtin> I believe I have sent everyone an email with useful links, documentation, and other information about the project; if you didn't get that email, let me know and I'll send it to you (and make sure I send everything to your correct email in the future)
00:08 < rcurtin> as far as schedule goes...
00:08 < rcurtin> right now is the "Community Bonding" period, which goes from May 4 to May 30
00:09 < rcurtin> ideally, during this time, you can get to know your mentor a bit, get to know the community a bit, maybe have some fun, maybe learn a bit more about mlpack
00:09 < rcurtin> some nice ways to do this are on the mailing list, or direct emails to your mentor, or here on IRC---don't feel restricted to talk only about mlpack; we can have some fun in the channel also :)
00:10 < rcurtin> once the community bonding period is over, the actual coding goes from may 30 to august 21
00:10 < rcurtin> during that time, there will be two midterm evaluation periods; one at the end of June, and one at the end of July
00:10 < rcurtin> then, at the end of the summer there's a final evaluation
00:10 < rcurtin> after that, there's no schedule imposed by Google, but we hope that you will stick around and continue to participate in the project :)
00:10 < rcurtin> if there are any stipend issues or administrative issues, probably Google will be the most helpful there but you should feel free to ask me and I can try and help or escalate to Google as needed
00:11 < rcurtin> so, before I move on, any questions about schedule or anything? next we can talk about student and mentor expectations
00:12 < rcurtin> ok, I will take that as a no :)
00:12 < Kirill> yeah, we got it
00:12 < rcurtin> I don't think that any of the expectations here are too difficult, but I think it's important to discuss these before the summer starts to ensure we are all on the same page
00:12 < rcurtin> students are expected to work the equivalent of a full-time job or internship; so, a full work week
00:12 < rcurtin> it's okay if some weeks you work more and some weeks you work less, but in the end it should even out
00:13 < rcurtin> it's also okay if you have to travel or will be unavailable, but please make sure your mentor knows that you'll be gone
00:13 < rcurtin> disappearing students are a big problem in GSoC, so if we don't hear from you for a little while we may start to get scared that you have disappeared :)
00:14 < rcurtin> regular contact and communication with mentors is very important and expected; preferably, this would be via the public #mlpack IRC channel, but if you or your mentor prefer alternate means of contact that is okay too
00:14 < rcurtin> the reason we suggest using the public channel or public mailing list is so that more people than just your mentor can answer any questions; sometimes this can be helpful
00:14 < rcurtin> if you are having trouble with some part of your project, definitely do not be afraid to ask---that is what the mentor is there for
00:14 < rcurtin> students are also expected to provide some kind of weekly status update to the community
00:14 < rcurtin> this could be an email post, like these examples:
00:15 < rcurtin> https://mailman.cc.gatech.edu/pipermail/mlpack/2013-July/000142.html
00:15 < rcurtin> https://mailman.cc.gatech.edu/pipermail/mlpack/2016-July/001022.html
00:15 < rcurtin> or it could be a blog post, like these:
00:15 < rcurtin> http://mlpack.org/gsocblog/improvement-of-automatic-benchmarking-system-week-10-highlights.html
00:15 < rcurtin> http://mlpack.org/gsocblog/approximate-nearest-neighbor-search-week-11.html
00:15 < rcurtin> whichever way you'd like to do it is up to you, but weekly updates are important because people in the community may be interested in following what you are up to in your project
00:15 < rcurtin> the blog posts are done through a Github repository at https://github.com/mlpack/blog
00:15 < rcurtin> and I'll make sure you have the right permissions to post there after the meeting
00:16 -!- govg [~govg@unaffiliated/govg] has quit [Ping timeout: 240 seconds]
00:16 < rcurtin> not every project goes according to plan; sometimes, the project may fall way behind the timeline, or it may proceed way ahead of the timeline
00:16 < rcurtin> this is not necessarily a problem, do not worry!
00:16 < rcurtin> if this does happen, the student and the mentor should discuss to see what is realistically accomplishable in the rest of the summer and adjust the goals accordingly
00:17 < rcurtin> many GSoC projects don't get completed in the original way they were defined, so don't worry if things change during the summer---estimating how much work some software can take is a very hard task, and almost nobody gets it right
00:17 < rcurtin> we really hope we won't fail any students this year, and we certainly don't expect to---every accepted student is a good student, so the prior probability of failure is very low in our opinions
00:18 < rcurtin> but, we will fail a student who disappears, doesn't appear to be working the expected amount, or who is otherwise seriously underperforming
00:18 < rcurtin> if such a situation does occur, the student will be made fully aware with warnings, so any failure will not be unexpected
00:18 < rcurtin> like I said, I really don't expect this to be an issue, but it's important to talk about it beforehand in case any issues actually do arise, so that we are on the same page
00:18 < rcurtin> any questions about student expectations? if not, I'll go ahead and move onto mentor expectations
00:19 < zoq> nothing from my end :)
00:20 < rcurtin> this meeting is going faster than the last one, not very many questions :)
00:20 < rcurtin> no complaints from my end about that :)
00:20 < Kirill> :)
00:20 < rcurtin> next I'll talk about mentor expectations, which are also important to set out beforehand
00:20 < rcurtin> mentors should work with their students to determine times that they are both available to work together
00:21 < rcurtin> they should also be willing to debug code problems and help the student understand any theory as needed
00:21 < rcurtin> similar to how students are expected to be available and in contact with their mentors, mentors are also expected to be responsive and in contact with their students, providing reasonably quick replies
00:21 < rcurtin> the mentor shouldn't do the majority of the work on the project, of course, but they should be there to help as needed
00:22 -!- stephentu [~stephentu@c-98-248-250-153.hsd1.ca.comcast.net] has joined #mlpack
00:22 < rcurtin> like earlier, the best form of communication for a student-mentor pair is going to depend on the preferences of the student and mentor, but ideally this may be best to do in public, to allow others to contribute/observe/ask questions if they might like
00:23 < rcurtin> hey Stephen :) if you want to catch up, some logs are here: http://mlpack.org/irc/
00:23 < rcurtin> (or maybe you already read the logs from this morning, in which case, you probably know what I'm about to say! :))
00:24 < rcurtin> the two midterm evaluations and final evaluation need to be done by the mentor during the periods described by Google (end of June, end of July, and end of August)
00:24 < rcurtin> if there is some problem, I think I can enter the evaluations as an organization administrator
00:24 < rcurtin> but as always, if there's any problem you can just ask me (or zoq or whoever) and we can all try and get it figured out
00:24 < rcurtin> any questions about mentor expectations?
00:25 < rcurtin> also, can everyone send me the github account they'd like to use this summer so I can add them to the right Github team for GSoC students?
00:25 < Kirill> ok
00:25 < stephentu> i'm glad i joined around the time you were talking about mentor expectations lol
00:25 < rcurtin> :)
00:25 < stephentu> sorry i had a meeting run late
00:25 < Kirill> here or through mail?
00:25 < rcurtin> no problem, don't worry about it
00:25 < rcurtin> here is fine, you can just give me the ID
00:25 < Kirill> micyril
00:25 < rcurtin> Kirill, I think I already have yours, I'm assuming you'll use the same one you have in the past
00:26 < rcurtin> yeah, right, let me add that now
00:26 < rcurtin> and Stephen you're already a member of the organization :)
00:26 < Kirill> yeah, I sent it just to make sure
00:26 < rcurtin> chenzhe: what Github account id should I add for you?
00:27 < rcurtin> I guess I need Kartik's also, but I think he is not here
00:27 < rcurtin> anyway, those can be added later, no need to do it now :)
00:27 < rcurtin> next I'll add some short history, I dunno if it will be interesting for anyone, but I think it is interesting :)
00:27 < rcurtin> mlpack was first developed in 2007 in a lab at Georgia Tech (so, over ten years now!)
00:28 -!- chenzhe1 [~Thunderbi@nat-5-176.uws.ualberta.ca] has joined #mlpack
00:28 < rcurtin> the lab had maybe ~10 people that contributed to the library early on, and I joined the effort around late 2009
00:28 < rcurtin> my job, when I joined the lab, (in addition to research...) was to prepare the library to actually release as open source
00:28 < rcurtin> but this took two full years of refactoring and a team of people, so mlpack 1.0.0 wasn't released until december 2011 at a NIPS workshop
00:29 -!- chenzhe [~Thunderbi@2620:101:c040:7f7:9059:b24e:2a5e:3903] has quit [Ping timeout: 246 seconds]
00:29 -!- chenzhe1 is now known as chenzhe
00:29 < rcurtin> at that point, the lab kind of died when the advisor (Alex Gray) left Georgia Tech to start a company called Skytree
00:29 < rcurtin> and the few remaining people (which I think at some point was just me) got involved with Google Summer of Code, and the community has grown a ton since then
00:29 < rcurtin> this is our fourth Summer of Code, and like I said earlier by far the biggest
00:30 < rcurtin> now the library has somewhere over 80 contributors, from all different continents except Antarctica
00:30 < rcurtin> and one of the contributors is actually a deep learning system: https://github.com/C0deAi
00:30 < rcurtin> there's also a pull request open from North Korea, so I think we are the only ML library with code that's come from there :)
00:31 < Kirill> :D
00:31 < rcurtin> it is very exciting that when I travel to conferences now, there is some good name recognition of mlpack---people know what it is, unlike in 2012
00:31 < rcurtin> I'm really hoping that many of the projects this summer will get a lot of interest from the larger machine learning community
00:31 < rcurtin> many of the projects are focused on the neural network code, which I am hoping we will be able to release as stable soon, and that will probably get a lot of interest
00:32 < stephentu> can i ask a general mlpack question
00:32 < rcurtin> in addition, I am currently working on automatic Python bindings for the command-line programs, and this should help bring more people to use mlpack (and maybe other languages too)
00:32 < rcurtin> of course, go ahead
00:32 < stephentu> how do you see mlpack w/ respect to all the other ML frameworks out there like sklearn, and all the DL frameworks like pytorch and tensorflow
00:32 < stephentu> esp the ones w/ company backed support
00:32 < rcurtin> I think the focus is pretty different, or, at least it traditionally has been different
00:33 < rcurtin> technically we have a little bit of support via Symantec, but that's not the same thing as, e.g., Spark and Databricks :)
00:33 < rcurtin> traditionally mlpack focused on very fast implementations instead of ease of use
00:33 < rcurtin> since it's in C++, it's already much higher on the learning curve than most people want to climb
00:33 < rcurtin> mlpack also has typically focused on less "standard" algorithms, and has implementations of a lot of stuff you won't find elsewhere
00:33 < rcurtin> (though we have definitely added more 'standard' techniques over the years)
00:34 < stephentu> ya i think thats how i stumbled upon mlpack in the 1st place
00:34 < stephentu> i was looking for some SDP implementation
00:34 < rcurtin> yeah, that's a decent example---mlpack has one of the better optimizer frameworks out there (in my opinion)
00:34 < stephentu> cool
00:35 < rcurtin> I think, moving forward, that the best path might be to focus on speed, then ease of use
00:35 < Kirill> does mlpack support running in parallel on multi-core processors?
00:35 < rcurtin> Kirill: sort of; there is some OpenMP support for some algorithms
00:36 < rcurtin> and you could also use OpenBLAS inside of Armadillo to get parallel linear algebra
00:36 < rcurtin> Shikhar's project this year will focus on some parallelization too
00:36 < Kirill> ok
00:36 < rcurtin> that's not to say it's perfect; there is a lot of room for improvement :)
00:36 < rcurtin> but, I think maybe "there is a lot of room for improvement" applies to just about any code anywhere :)
00:37 < rcurtin> stephentu: back to your original question, I'm not fully sure how to see mlpack's DL code vs. pytorch or tensorflow yet
00:37 < rcurtin> I have to come up with something, because deep learning is really hot inside of Symantec and my goal is to get people to use mlpack inside of Symantec :)
00:38 < rcurtin> so I will have to figure out some persuasive arguments for why one would use mlpack vs. TF or whatever else... and I think maybe benchmarks will be a good part of that argument
00:39 < stephentu> good luck, its a tough battle to fight
00:39 < rcurtin> agreed, it absolutely is
00:39 < rcurtin> but in the mean time, I am having fun working on mlpack and I think there are places where we definitely provide some nice support that other libraries don't :)
00:40 < rcurtin> so I don't have too much insight on "where things will be" in a year or so, we will have to see, but I do know that the improvements from this summer's projects will be exciting and my opinion is they will capture a good amount of interest :)
00:40 < rcurtin> I guess, I'm out of things to say in the meeting, but I think maybe it is a good idea to have some introductions so we can get to know each other a little bit
00:40 < rcurtin> I don't have any formal structure, so maybe it will be chaotic, but...
00:41 < rcurtin> I'm Ryan, I live in Atlanta, I did my B.S., M.S., and Ph.D. at Georgia Tech, and now I work for Symantec and still managed to stay in Atlanta
00:41 < chenzhe> it seems that I just lost connection when you asked me about github account, did you get it?
00:41 < rcurtin> chenzhe: no, I didn't see the message, sorry about that
00:41 < chenzhe> czdiao
00:41 < rcurtin> excellent, thanks
00:41 < chenzhe> or maybe diao@ualberta.ca
00:41 < rcurtin> in my free time, my favorite hobby is racing go karts; it's a lot of fun --- http://ratml.org/misc_img/ironman_round_3.jpg is a picture
00:42 < rcurtin> that is my introduction, everyone else should feel free to introduce themselves :)
00:44 < stephentu> we are all too shy to introduce ourselves
00:44 < rcurtin> apparently so, that's okay :)
00:44 < stephentu> i guess i'll go
00:44 < rcurtin> the morning crowd was much more talkative :)
00:44 < stephentu> im one of the mentors this year.
00:44 < stephentu> i am a phd student at UC berkeley.
00:44 < stephentu> hoping to graduate someday
00:45 < rcurtin> but even if you don't, the weather is always nice so it is not a problem to stay in Berkeley :)
00:45 < stephentu> i've been trying to play more guitar lately
00:45 < stephentu> which is fu
00:45 < stephentu> fun
00:45 < stephentu> lol its also lots of FUUU
00:45 < stephentu> weather is great here, COL isn't as great
00:45 < stephentu> :(
00:45 < rcurtin> yeah I thought that's what you meant... I have been learning the bass and my experience is maybe more FUUU than fun :)
00:46 < rcurtin> yeah, that is a disadvantage to california :(
00:46 < chenzhe> what is COL?
00:47 < stephentu> cost of living
00:47 < chenzhe> that's true......
00:48 < chenzhe> Your last name seems to be Chinese
00:48 < stephentu> my parents are from taiwan
00:48 < stephentu> i was born in the US though
00:49 < chenzhe> I see, I guess you don't speak Chinese 😀
00:50 < stephentu> its been my goal for N years to improve chinese
00:50 < stephentu> but it never happens
00:50 < chenzhe> haha
00:50 < stephentu> so unfortunately we will be hosting our meetings in english
00:50 < stephentu> one of these days
00:50 < chenzhe> That's true~ Language is hard
00:51 < rcurtin> I have been trying to brush up my german, even that is difficult :)
00:51 < rcurtin> and nowhere as hard as chinese
00:51 < rcurtin> nowhere near*
00:52 < chenzhe> haha
00:52 < chenzhe> I can go now~ My name is Chenzhe, I am doing my Ph.D. in U of Alberta in Canada
00:53 < chenzhe> I work in Applied math, maybe graduating later this year or next year
00:54 < chenzhe> I like skiing~
00:54 < stephentu> like a true canadian
00:54 < chenzhe> That's what people can do in Canada
00:54 < chenzhe> I guess you cannot do anything else in the long winter
00:54 < rcurtin> perfect place to like skiing :)
00:55 < rcurtin> atlanta is too hot for that, there are not big mountains and not much snow
00:55 < chenzhe> We just said goodbye to last big snow a few weeks ago
00:55 < stephentu> where are you originally from?
00:56 < rcurtin> it was 90F here today :(
00:56 < chenzhe> Mainland China, you might have heard about an ancient city named Xi'An
00:56 < Kirill> chenzhe, here in Russia we also had snow lately
00:56 < Kirill> so, you are not alone
00:56 < chenzhe> haha
00:57 < chenzhe> We finally got spring now, it's about 20 C
00:57 < stephentu> chenzhe: there is this delicious place in NYC called xian's famous foods
00:58 < stephentu> do you know if it is the same xi'an?
00:59 < chenzhe> really? When I was in Flushing, I remembered there is a small restaurant named Biang, which is a very complicated Chinese word
00:59 < stephentu> http://xianfoods.com/
00:59 < stephentu> delicious
00:59 < stephentu> very bad for you
00:59 < stephentu> but delicious
01:00 < Kirill> So, maybe it's a good time to introduce myself
01:00 < chenzhe> Looks similar, this city is actually famous for all kinds of noodles
01:00 < chenzhe> sure
01:01 < Kirill> my name is Kirill as you can guess :)
01:02 < Kirill> I'm a PhD student at Ural Federal University (Ekaterinburg, Russia) working on Computational Humor
01:02 < Kirill> This summer I'm going to work on cross-validation and hyper-parameter tuning infrastructure
01:03 < stephentu> Kirill: will you be experimenting w/ any of these bayesian methods for hyperparam selection?
01:03 < Kirill> In my free time I like to walk and cycle with friends
01:04 < Kirill> stephentu: it will be beyond the scope of this summer, but it can be extended in this direction in the future
01:05 < Kirill> I hope to make it flexible enough to make it possible
01:05 < stephentu> cool, ya its definitely a lot of work to implement that stuff
01:05 < rcurtin> yeah, I am very excited about this project, but you probably already know that from the emails :)
01:07 < stephentu> so rcurtin can you explain a bit more about what community bonding entails
01:07 < stephentu> maybe i missed this
01:07 < stephentu> we're not technically supposed to get started yet
01:07 < stephentu> from what i can tell
01:07 < rcurtin> yeah, the students don't need to write any code yet (unless they want to)
01:08 < rcurtin> the idea is for students to integrate into the community and get to know the other students and mentors
01:08 < rcurtin> so that the summer itself is more fun and feels less like a heartless consulting job :)
01:08 < stephentu> haha
01:08 < rcurtin> in a real job I guess this would be the equivalent of water cooler discussions
01:09 < rcurtin> but we don't have an international water cooler, just #mlpack and the mailing list :(
01:09 < rcurtin> (and whatever other communication methods)
01:11 < rcurtin> there are no strict and hard requirements for the community bonding period though, just a general idea :)
01:16 < rcurtin> ok, I guess there is nothing else for now, maybe we will hear from Kartik and Sumedh in the future :)
01:17 < rcurtin> feel free to idle in the channel and chat!
01:17 < chenzhe> Sure
01:17 < rcurtin> thank you everyone for attending the meeting
01:17 < rcurtin> I'll be around on and off for the next couple hours before I go to bed
01:17 < rcurtin> I am looking forward to the summer :)
01:23 -!- Kirill [5855c2e3@gateway/web/freenode/ip.88.85.194.227] has quit [Ping timeout: 260 seconds]
01:26 < stephentu> great thanks
01:26 < stephentu> i'll try to come on irc more
01:26 -!- stephentu [~stephentu@c-98-248-250-153.hsd1.ca.comcast.net] has quit [Quit: Lost terminal]
02:24 -!- sumedhghaisas [~sumedh@188.74.64.249] has joined #mlpack
02:35 -!- mikeling [uid89706@gateway/web/irccloud.com/x-jyzwaovxnpyiydcb] has joined #mlpack
03:10 -!- chenzhe [~Thunderbi@nat-5-176.uws.ualberta.ca] has quit [Ping timeout: 260 seconds]
04:59 -!- govg [~govg@unaffiliated/govg] has joined #mlpack
05:38 -!- govg [~govg@unaffiliated/govg] has quit [Ping timeout: 240 seconds]
06:10 -!- vivekp [~vivek@unaffiliated/vivekp] has quit [Ping timeout: 272 seconds]
06:10 -!- vpal [~vivek@unaffiliated/vivekp] has joined #mlpack
06:11 -!- vpal is now known as vivekp
07:33 -!- mentekid [~yannis@cpc92878-cmbg18-2-0-cust813.5-4.cable.virginm.net] has joined #mlpack
10:47 -!- sumedhghaisas [~sumedh@188.74.64.249] has quit [Ping timeout: 240 seconds]
11:12 -!- sumedhghaisas [~sumedh@188.74.64.249] has joined #mlpack
13:22 -!- govg [~govg@unaffiliated/govg] has joined #mlpack
13:28 -!- sumedhghaisas [~sumedh@188.74.64.249] has quit [Ping timeout: 240 seconds]
15:32 < rcurtin> ok, I set up the benchmarks repository to post to the mlpack-git mailing list when commits are pushed
15:32 < rcurtin> I'll also get PRs and issues set up for the other repositories to send emails to the mlpack-git mailing list
15:33 < zoq> good idea and there is the first one
15:35 < rcurtin> ok, that should be set up correctly now
15:35 < rcurtin> now, since I am on the mlpack-git mailing list, I have to set my personal github account to ignore all activity on the blog and benchmarks repositories
15:36 < zoq> yeah, a new filter for me as well
17:00 -!- sumedhghaisas [~sumedh@188.74.64.249] has joined #mlpack
17:49 -!- chenzhe [~Thunderbi@2620:101:c040:7f7:6d47:e371:392e:36d5] has joined #mlpack
18:32 -!- mikeling [uid89706@gateway/web/irccloud.com/x-jyzwaovxnpyiydcb] has quit [Quit: Connection closed for inactivity]
20:09 -!- chenzhe [~Thunderbi@2620:101:c040:7f7:6d47:e371:392e:36d5] has quit [Ping timeout: 260 seconds]
20:51 -!- chenzhe [~Thunderbi@2620:101:c040:7f7:7179:edc1:a0e0:d0b2] has joined #mlpack
23:39 -!- sumedhghaisas [~sumedh@188.74.64.249] has quit [Ping timeout: 240 seconds]
--- Log closed Sat May 13 00:00:04 2017