Wednesday, July 24, 2019

My mortgage company sucks at security

TL;DR


My mortgage company had all of its customer documents (monthly statements, mortgage applications, etc.) easily available for public download with no login required, no automated monitoring, and trivial discoverability.

Background


Although I write security-related software for a living, I'm not by any means a security engineer or hacker.  Basic SQL injection and the crypto stuff covered in the Matasano cryptopals challenges are about the only "hacking"-type things I know how to do.  That said, every once in a while there are sites that have security flaws so obvious that even I notice them.  This is a story of one of those times.

Note: The events described below occurred in November 2017.  I'm posting about them so long after-the-fact not due to any agreement or requirement to wait, but due to being busy and just not getting around to it until now.

27 November 2017


Because I loosely keep an eye on my finances, I wanted to check how much was left on my mortgage (answer: as always, a depressingly large amount).  Easy enough.  My mortgage servicer has a website.  Log on, go to the documents section, grab my latest statement.

Quick aside for a bit of background info: There are three usual ways filenames get constructed when you download something from a website (and all of them are chosen by the website you're downloading from and not your browser).  The first is to use some form of info about the document (e.g., "Statement-October-2017.pdf").  The second is to always have every document end up with the same name (e.g., "document.pdf").  The third is to use some random UUID, possibly but not necessarily related to the identifier used to track the document internally (e.g., "1af28f37-7f1f-4fde-be8f-8ab1d5bc0bee.pdf").

Anyway, back to me grabbing a mortgage statement.  I download the most recent one, and immediately notice that the filename is unusual.  Instead of being one of the three formats I mentioned above, it's something like 225123456.pdf (that isn't the exact number, but the length and the first 3 digits are the same).  That may appear to a casual observer to be a random number, but when it comes to computers, that's nowhere near large enough to be a random number.  The UUID I gave as an example above is a standardized format; that contains 3.4x10^38 possible values[1], whereas a 9-digit number gives you, well, 10^9 possibilities.  That's far too small to use as a random number in this context, but it's also concerning for another reason.
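For a rough sense of scale, here's a quick Python sketch of that comparison.  It assumes, like the figure above, that all 128 bits of a UUID are free to vary; a random (version 4) UUID actually fixes 6 of those bits, which is included for comparison and doesn't change the conclusion.

```python
# Size of the ID space for a 9-digit decimal number vs. a UUID.
nine_digit_ids = 10 ** 9
uuid_ids = 2 ** 128       # assumption: treat all 128 bits as free
uuid_v4_ids = 2 ** 122    # version-4 UUIDs fix 6 bits (version + variant)

print(f"9-digit IDs: {nine_digit_ids:.1e}")
print(f"UUIDs:       {uuid_ids:.1e}")
print(f"UUIDv4:      {uuid_v4_ids:.1e}")
print(f"UUIDs per 9-digit ID: {uuid_ids / nine_digit_ids:.1e}")
```

Either way, guessing a valid UUID is hopeless, while walking through every 9-digit number is an afternoon's work for a script.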

In order to allay my fears that my mortgage company is grossly negligent, I download my previous statement.  224345678.pdf (again, not the exact number, but something like that).

That's not good.  The company definitely appears to be numbering documents sequentially, which makes it really easy to get an estimate of how many mortgages they service[2].  Although that isn't top-secret information, it's probably something that they don't want to be publicly known with as much detail.  To make sure it isn't just random chance, I grab a third document, from a year or two back.

That's when I start to get very, very concerned.  The filename is exactly what I was expecting (that is, a number somewhere less than 224345678), but more concerningly, the download link catches my attention due to its simplicity: https://[company website].com/[simple path]/?documentid=123456789

They can't possibly be as bad at this as I think they might be, so I grab my current statement's ID, decrement it by one, and load the URL https://[company website].com/[simple path]/?documentid=225123455.

It's someone else's mortgage statement for that month.  This means that best case, they only perform an authentication check when you try to download a document, not an authorization check (basically, they make sure you're logged in, but not that you should actually have access to the given document).  In order to see if they even check authentication, I grab the link for my statement and go to the command line.
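To make the authentication/authorization distinction concrete, here's a minimal Python sketch of what a download handler would need to check.  Everything in it (the Document class, the DOCUMENTS store, get_document, the owner names) is made up for illustration; I have no idea what the company's actual code looks like.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Document:
    document_id: int
    owner_id: str
    body: bytes


# Hypothetical in-memory store standing in for the real document database.
DOCUMENTS = {
    225123455: Document(225123455, "someone-else", b"their statement"),
    225123456: Document(225123456, "me", b"my statement"),
}


def get_document(document_id: int, session_user: Optional[str]) -> Tuple[int, bytes]:
    """Return an (HTTP-style status code, body) pair for a download request."""
    # Authentication: is anyone logged in at all?
    if session_user is None:
        return 401, b""
    doc = DOCUMENTS.get(document_id)
    # Authorization: is this particular user allowed to see this document?
    # Missing documents and other people's documents get the same answer,
    # so the response doesn't reveal which IDs exist.
    if doc is None or doc.owner_id != session_user:
        return 404, b""
    return 200, doc.body
```

With this sketch, get_document(225123455, "me") comes back as a 404 even though I'm logged in, which is the check that appeared to be missing.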

wget 'https://[company website].com/[simple path]/?documentid=225123456' -O 'statement.pdf'

Open statement.pdf and it's the correct statement.  Turns out that there's not even a requirement that the user be logged in.  My browser has cookies that identify who I am, but wget doesn't have any of those, it's just an anonymous GET request, and it still worked.  Is it doing some unexpectedly-fancy IP-based auth check[3]?  Hop up into the cloud, ensure I have a different IP address and a clean computer with no info whatsoever about my account.  Still works.

Just to verify the severity before alerting the company (so that I know whether it's a "you really need to fix this ASAP"-level bug or if it's a "you will go out of business if you don't fix this ASAP"-level bug), I try downloading the document with ID 1.  It's a letter to a customer from 2007 (if I remember correctly, it's informing them that their adjustable mortgage rate was getting adjusted).  I check another random number.  It's a mortgage application, with all of the personal information that entails.

Alright, no more wasting time.  Delete everything I've downloaded.  Send a message through their customer support portal asking for the contact info of their security team, and do the same for the contact us link of their parent company.  Look up the WHOIS contact addresses for their domain and their parent company's domain, and email them, as well.

Nothing else I can do until I get a response, so I set up a cron job that, every minute, tries to download a selection of my statements without providing any auth info, and records a metric of whether or not it succeeded.
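A check like that doesn't need much.  The following is a sketch of one way to do it in Python rather than the exact script I ran; the function names, the log format, and the "starts with %PDF" test are all assumptions for illustration.

```python
import time
import urllib.error
import urllib.request


def is_publicly_downloadable(url: str, timeout: float = 10.0) -> bool:
    """True if an anonymous GET (no cookies, no auth headers) returns a PDF."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            # PDF files start with the magic bytes "%PDF".
            return response.status == 200 and response.read(4) == b"%PDF"
    except (urllib.error.URLError, OSError):
        # HTTP errors, timeouts, and connection failures all count as "not exposed".
        return False


def record_check(urls, log_path, check=is_publicly_downloadable, now=time.time):
    """Append one 'timestamp exposed_count total_count' line to log_path."""
    exposed = sum(1 for url in urls if check(url))
    with open(log_path, "a") as log:
        log.write(f"{int(now())} {exposed} {len(urls)}\n")
    return exposed
```

Called once a minute from cron with the download links for a few of your own statements, the log shows the exposed count drop to zero as soon as the hole is closed.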

28 November 2017


Get the contact info for their security folks and let them know what's up.  They confirm that it's a problem and 75 minutes after I sent them the details, the website is inaccessible with a message that it's "currently down for routine maintenance".  Later that day, it's back up, using a seemingly secure method for downloading documents (I didn't dig all that deeply as I'm not actually a security engineer/hacker/whatever).  All things considered, if you ignore the colossal mistake that led to it being necessary, that's a good response time.

A couple days later


Chat with their CIO.  They don't officially have a bug bounty program, but I've now got a framed thank-you letter and check (mobile phone check depositing FTW) on the wall above my desk.  He seemed a bit annoyed and requested that I donate the money to charity[4], but specifically said that he couldn't require it (obviously), and, importantly, specifically said that there weren't any restrictions on talking about the event.

They said that they identified the people whose records were accessed without auth and that it was just the handful that were part of my investigation, but I'm not sure if I believe them.  Well, I'm unsure if I believe them when they say that they found everyone they think was impacted.  I am fairly sure that even if they think they identified everyone impacted, they probably didn't.

I don't see any other mentions online of their massive security flaw, so it's possible nobody else noticed, but it's equally possible that roughly a quarter-billion mortgage documents are now floating around on the black market.  It seems pretty safe to assume that they didn't have any automated system in place that would've detected someone exploiting this, as I had a cron job repeatedly downloading my own statements without logging in and yet they weren't aware of the issue until I informed them.

Had someone known about this and wanted to be malicious, and assuming the mortgage company's website has reasonable network capacity (which seems safe to assume given that they have hundreds of thousands of customers, and that I've never had any latency or similar issues with their website), an attacker could've downloaded every document in a couple days for a couple thousand dollars of cloud computer time and tens or hundreds of dollars of hard drive space.  So, it's probably a good idea to assume that if your mortgage is managed by the company, all your info is now for sale on some Tor marketplace.

Infosec analysis


When assessing the severity of a vulnerability, one common tactic is to calculate a DREAD score.  That stands for Damage, Reproducibility, Exploitability, Affected users, and Discoverability.  This issue just about maxes out all of them.  Let's go through it based on this arbitrarily-chosen rubric (it was the first one on Google that's applicable to this type of vulnerability).

Damage


The damage is that all of a customer's data is revealed.  Mortgage statements are an issue, but mortgage applications, which were included, are catastrophic, as they contain literally all of the information necessary to steal someone's identity (by definition, they include everything you need to apply for a mortgage while pretending to be that person).

Let's say that's a 7.  The most sensitive data the company has is compromised, but there's no direct ability to change things.

Reproducibility


Until fixed, this could be reliably reproduced every time it was attempted.  That's definitely a 10.

Exploitability


This can be easily exploited using preexisting command-line tools and scripting languages.  In order to download everything it isn't literally point-and-click, but downloading everything only requires incredibly basic scripting (wget and a for loop) and downloading selected documents can be done trivially in a web browser.  Let's give this a 9 as it is theoretically possible for them to make it easier.

Affected Users


This is a bit iffy.  Generally these scores are calculated for issues in frameworks used by a number of different systems, and this refers to how many of those systems are vulnerable.  However, in this case, it appears to be a problem with a company-specific stack.  As 100% of the possible targets (that is, this one company) are affected, and it exposes the data of 100% of their users, I'm giving this a 10, but it might also be fair to exclude this category from the overall score calculation.

Discoverability


Before this was fixed, it was easily discoverable by any customer.  However, it likely wouldn't have been discovered by any standard scanning tools (that look for publicly-known security issues).  There wasn't public info detailing this specific attack against this specific website, but lack of auth for downloads is a commonly-known problem.  Let's give this a 6, one point below "there's a published guide".

Overall


7 + 10 + 9 + 10 + 6 = 42/50.  According to the rubric, "Findings with this risk rating, if not quickly addressed, may pose risks that could negatively impact business operations or business continuity."  That sounds about right.
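For anyone who wants to play with the numbers, here's the same tally as a small Python sketch.  The scores are the ones assigned above; the second total, which drops the "Affected users" category as suggested might be fair, is just arithmetic on those scores and isn't something the rubric itself defines.

```python
# DREAD scores assigned in the sections above (each out of 10).
scores = {
    "Damage": 7,
    "Reproducibility": 10,
    "Exploitability": 9,
    "Affected users": 10,
    "Discoverability": 6,
}

total = sum(scores.values())
maximum = 10 * len(scores)
print(f"Overall: {total}/{maximum}")

# Same tally with the debatable "Affected users" category left out.
total_without_affected = total - scores["Affected users"]
maximum_without_affected = maximum - 10
print(f"Excluding affected users: {total_without_affected}/{maximum_without_affected}")
```

Dropping that category gives 32/40, which is a slightly lower percentage (80% instead of 84%) but the same general ballpark.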

But wait, you never said what company it was.  How can I know if I might be impacted?


Didn't want to distract from the story.  The company is Specialized Loan Servicing, a subsidiary of Computershare.

How do I know this isn't all made up in an attempt to slander a well-respected and much-loved company?


I've got emails documenting all of this, but to protect the specific identities of the employees involved I'm not going to post them here.  However, if it's necessary at any point in time for verification purposes, they're available.

Footnotes


[1] To give a mortgage-related-example of how big that is, it's roughly enough to provide a mortgage statement to every person on the planet, every microsecond, for the entire age of the universe.  And then, for good measure, for the next 100000 ages of the universe, as well.

[2] Just comparing two adjacent-month statements won't give you a great estimate, because the order they process each one in a given month may vary, and there'll be other assorted documents mixed in, but by looking at when and how many non-statement documents are posting in your account (to figure out what fraction of documents are statements in a given month of the year), and taking a long-term average of the difference between statement filenames, you'll be able to get an estimate of both the number of mortgages they manage and how that number changes over time.

[3] The answer to this is obviously "of course not, you'd only bother with that after several other auth layers that you've already demonstrated are completely absent", but I felt the urge to check nonetheless.

[4] CIO guy, if you're reading this, I annually donate more money than I got from this, so if it makes you feel any better you can consider your check part of that for 2017.

Thursday, June 13, 2019

Making Homemade Fountain Pen Ink

Intro


Feel free to skip this if you don't care about backstory


I've used fountain pens off and on for the last 10+ years, but I've gotten more into it since a work trip to the UK last year reintroduced me to them (found a couple cheap ones in a convenience store while grabbing dinner and decided to try them out).  I've played around a bit with reshaping nibs, as I've got a bunch of abrasives and polishing stuff lying around from other projects, but it turns out that the people who make nibs generally seem to know what they're doing, and it's tough to improve on them all that much (at least with my current skill level).

As I'm not myself if I'm not hilariously overbooked with unnecessary side projects, I instead decided to look at custom inks.  I'm picky when it comes to ink color and, at $10-$20 a bottle, it gets very expensive very quickly trying to find just the right shade.  Opinions vary on whether mixing commercial inks together is a good idea, and it hasn't caused any problems when I've tried in the past, but that seemed too easy.  I tried to find recipes online for homemade inks, but it turns out that there isn't much info available.

It seems everyone who has made their own falls into one of three categories: a) it was long enough ago that link rot has removed it from the web, b) it's buried in some thread that isn't well-indexed by Google, or, annoyingly but commonly, c) they're explicitly keeping the exact composition private as a "trade secret", because they want to sell their ink with minimal competition.

I have no intention of selling fountain pen ink, as I'm no longer in college, have a real job, have too many hobbies already, etc., so I'm including exact ingredient lists below.  As I make new inks in the future, I'll either update this post (for minor pieces of info) or make new posts with the details of those inks, as well.

If anyone has questions about the exact processes followed, exact ingredients used, anything like that, definitely ask and I'll do my best to answer.  I'm one of those quasi-libertarian OSS contributors who thinks information should be free for the masses; no reason fountain pen ink recipes should be any different.

Ink overview


A lot of the message threads I found referenced a Google Sites page that's no longer there, but the general gist I was able to glean is that fountain pen ink has 4 or 5 ingredients:

-Water - fairly obvious.

-Dye - gives the ink color.  Too little and the ink is too light, too much and it's too dark (and may also clog your pen).

-Surfactant - reduces surface tension.  Too little and it won't flow well, too much and it'll bleed/feather once on paper.

-Humectant - slows evaporation rate.  Too little and it'll dry up on the nib, too much and it'll be annoyingly slow to dry on the page.

-Biocide/preservative (optional) - stops mold.  Too little and the ink will go bad, too much and it'll be poisonous.

The other takeaway was that it's pretty easy to make, although a) it's really tough to get consistent colors due to needing to keep all the ingredient amounts constant, and b) there's a risk of messing up pens due to clogging or corrosiveness.  As I've got a milligram scale for measuring pure caffeine, keeping ingredients consistent shouldn't be too crazy difficult, and as I've got dozens of $1-$2 pens (I buy Jinhao sharks by the case specifically for trying out inks) I'm not concerned about a couple of them possibly being sacrificed to the altar of science.

Ingredients


(I've put Amazon links at the end of the article if you want to buy any of these.)

Water is water.  Use distilled to be on the safe side.

The surfactants mentioned were dish soap and Kodak Photo-Flo, one of the chemicals used in the process of developing film.  Because I like taking the complicated route, I went with Photo-Flo.

Humectant recommendations were vegetable glycerin and propylene glycol.  I went with propylene glycol because why not.

Dye is the tricky part.  There are a lot of options, but I ended up going with powdered aniline dye because I had prior experience with it from some woodworking a couple years back.  So far I've tried Keda Aniline Wood Dye and Jacquard Acid Dye; I also have some Jacquard Procion MX Dye in the mail, as I saw it recommended for being very permanent (at least on cellulose-based paper, which is most of them).

The two biocides I saw recommended were salicylic acid and phenol.  Seems like salicylic acid is safer and phenol is more effective.  I bought some phenol but haven't even opened the bottle yet because I'm still just figuring out colors and the oldest batch I have is under a week old right now, so not at any real risk of mold.  (In terms of safety I'm not overly concerned; the quantities involved are very low and the main risks seem to be around the levels of chronic exposure experienced by people working with phenol in an industrial capacity.)

Recipes


Base solutions


First, I made 1% and 10% solutions of both propylene glycol (PG) and Photo-Flo (PF).  Trying to add either of these undiluted into an ink will be incredibly difficult due to the minuscule amounts you need.

I also made a batch of solvent that's just 10ml of distilled water with 0.1 ml each of the 1% PG and 1% PF solutions, for diluting inks to make them lighter.
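To show just how minuscule those amounts are, here's a quick Python sketch of the dilution math for the solvent batch.  It assumes the percentages are by volume and that the volumes simply add up when mixed, which is close enough for this purpose.

```python
def diluted_fraction(stock_fraction, stock_ml, total_ml):
    """Fraction of the pure ingredient in the final mix."""
    return stock_fraction * stock_ml / total_ml


water_ml = 10.0
pg_stock_ml = 0.1   # 0.1 ml of the 1% propylene glycol solution
pf_stock_ml = 0.1   # 0.1 ml of the 1% Photo-Flo solution
total_ml = water_ml + pg_stock_ml + pf_stock_ml

# Amount of undiluted PG that ends up in the batch, in microliters.
pure_pg_microliters = 0.01 * pg_stock_ml * 1000
pg_fraction = diluted_fraction(0.01, pg_stock_ml, total_ml)

print(f"Pure PG in the batch: {pure_pg_microliters:.1f} microliters")
print(f"PG concentration:     {pg_fraction * 100:.4f}%")
```

That's about one microliter of each undiluted ingredient per batch, which is far less than a syringe or pipette can measure directly; hence the 1% solutions.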

Color bases


For all of the dyes I got, I made a base solution of 1 gram of dye powder and 10ml of water.  The only exception is the Keda black dye; it appears much stronger and 1 gram wouldn't come close to completely dissolving in 10ml of water, so I diluted it out to 40ml.

For the Keda dyes, I mixed in 0.1ml each of the 1% PG and 1% PF solutions for each 10ml of ink (so, 0.4ml for the black).

For the Jacquard dyes, I didn't include the PF/PG, and instead added that to the final inks I made.

There's no reason not to use these as inks in their unmixed form, but where's the fun in that?  I only have a pen using the black ink right now, the rest are just for mixing together.  Here's what they all look like (note: the ones that say "glass pen" were written with a glass dip pen instead of a proper fountain pen, which deposits more ink on the page and so results in a darker shade), as well as the recipes used for each one (although I also wrote it above).


Mixed batches


So far I've made 5 different mixes that I've liked enough to load into fountain pens and assign ID numbers for easy future reference.  The specific recipes are in the image; I could write them out here but you need to look at the picture anyway to see what color it is so it doesn't seem necessary.


All together now


Here are all 12 inks, in one photo for slightly easier comparison.


Is it worth it?


Complicated question.  If you factor in the theoretical value of my time, absolutely not, but that's partly because I'm hilariously overpaid and partly because I spend way too long doing things like this more precisely than is really necessary.  If you just look at the ingredients involved, and don't include the nice-to-have-but-not-strictly-required lab tools I list below, and if you accept the not-yet-certain assumption that this ink won't, in fact, destroy my Lamy 2000 (which sure as hell won't see this ink until it's worked properly in a cheap pen for at least a couple months with no issues), then possibly.

Let's calculate a rough per-unit cost.

You can get 25 grams of the Keda black dye for $18 on Amazon.

A liter of distilled water costs $0.25 (well, a gallon costs about a dollar). 

Using the formula I did, you'd also need 0.1 milliliters each of Kodak Photo-Flo and propylene glycol.  If you're making this in bulk, that amortizes out to nothing.  Let's assume you're making a single batch but can buy a tiny bit off some local photo shop or something for a couple bucks.  (If you're making a single batch and can't split the cost, it's annoyingly expensive, but let's be optimistic.) 

That works out to about $20 per liter, which sounds expensive if you aren't used to fountain pen ink, but the comparably-colored Diamine Gray runs $250 per liter (the comparable Noodler's Lexington Gray is cheaper, at $140 per liter, but in general Noodler's inks are fairly slow-drying (albeit waterproof once dry), so I find them inconvenient to use in most cases).
(In fairness, the other dye colors are more expensive per unit volume than the black, due to the higher concentration required, but the ones I've made end up maxing out at $80 per liter.)
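Here's that per-liter estimate for the black ink as a small Python sketch.  The dye and water prices and the 1 gram per 40ml concentration are the ones given above; the $2 for Photo-Flo and propylene glycol is an assumed stand-in for "a couple bucks".

```python
# Keda black dye: 25 grams for $18, mixed at 1 gram per 40 ml of water.
dye_price_per_gram = 18.00 / 25
dye_grams_per_liter = 1000 / 40
dye_cost_per_liter = dye_price_per_gram * dye_grams_per_liter

water_cost_per_liter = 0.25
additives_cost = 2.00   # assumption: "a couple bucks" for the PF and PG

total_per_liter = dye_cost_per_liter + water_cost_per_liter + additives_cost
print(f"Dye:   ${dye_cost_per_liter:.2f} per liter")
print(f"Total: ${total_per_liter:.2f} per liter")
```

Nearly all of the cost is the dye, so the other colors, which need more dye per ml, scale up accordingly.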

So, cheaper than real ink, but if you want a dark/vibrant color, only by a factor of 2-4.  That's certainly something, but not really a great reason to make your own.  It makes a lot more sense if you're the sort of person who wants to get really involved in every aspect of your hobbies, or if you're really, REALLY particular about ink colors.  Also if you're looking for one of those hobbies that can be self-supporting with a bit of luck, as you may be able to sell ink to fellow pen enthusiasts and possibly make enough to cover at least the cost of the ingredients (I definitely wouldn't count on using this to pay the bills, though).

Complete ink journal


This is the entirety of the journal I'm using to keep track of the ink work (minus the two pages already shown above with the index of color IDs and recipes).  Gives a bit of an idea of the sort of trial and error that goes into figuring out what inks look good.









Possibly-useful links


Links for learning stuff


Reddit thread about DIY ink.

FountainPenNetwork thread about DIY ink.

FountainPenNetwork thread about ink biocides.

If anyone has other good links to info on DIY ink, I'm happy to add it here.  Likewise, I'll update this list with other useful ones I encounter.


Links for buying stuff


Disclaimer: the Amazon links are affiliate links, so I get a tiny pittance if you buy stuff through them, but they're all stuff I used in making my own ink and not random things I'm trying to get people to buy.  (My integrity absolutely has a price and can be bought, but it costs way more than the $7.30 I've made in the last two years through affiliate links.)

The non-Amazon links are not affiliate links and I get nothing from you using them, apart from the knowledge that I sent business to places that have treated me well in terms of customer service in the past, and that's honestly worth more than the tiny trickle of affiliate earnings.

Ingredients


Keda aniline wood dye - Good starter set, but I might go straight to Jacquard dyes were I doing this again.

Jacquard crimson and Jacquard brilliant blue - Cheaper to order it from Blick Art Supplies, but I'm lazy.

Photo-Flo - you might be able to get this locally if there's a good photography store in town, or, as some places have mentioned, you can just use detergent instead.  However, if you want to go with this stuff, here's where I got it.  If you know anyone else who's interested in making ink, you can definitely split a bottle; I've used maybe 25 microliters (no, not milliliters) of this stuff so far making 60-70ml of usable ink.  If I were less busy I'd offer to sell small bottles of it, but it's really not worth my time.  That said, if anyone else wants to, I'm happy to update this with a link to their site.

Propylene Glycol - you can likely get this locally, as well, but I'm lazy.  As with the PF, I've used maybe 25 microliters of this stuff so far.  Also as with the PF, I'm too busy to sell tiny bottles, but am happy to advertise someone else who does if anyone wants the job.

Tools


The following aren't strictly required but have been incredibly useful for making/testing inks.  Also, for all of the lab equipment, it's worth looking for either friends with lab access who can get it for you at a discount or free, or places that are selling it cheaply due to it not being in good enough condition for use in a proper lab (e.g., I got the disposable plastic dishes mentioned below for $6 instead of $20 because it was a returned and opened-but-resealed box).

Milligram-precision scale for measuring ingredients - possibly overkill, but overkill is underrated and I already had this lying around.  If you don't care about exactly recreating a color down the road, may as well just trust the manufacturer's word for how much dye is in each jar you buy and mix it all with water from the get-go. 

Disposable pipettes for rough ink measurement - you can probably just get a couple and clean them out, but to be on the safe side, disposable ones aren't all that expensive and you can always save them and wash them, just less frequently than if you only have a dozen.  Could also just use drinking straws.

Syringes with dull needles for precise ink measurement - especially useful for measuring the PF and PG solutions.  I usually use the pipettes for ink if I need to measure by the drop (for testing mixes) or the ml (for making full batches), but these are useful for getting .1ml or .05ml of a liquid.  You can also get syringes and needles pretty cheaply at CVS or the grocery store (at least, you used to be able to), but they're a) sharp and b) more expensive if you want a large number.  Also, the person at the counter will look at you suspiciously even after you explain that they're for filling fountain pens and not shooting up heroin.

Disposable plastic dishes for mixing colors - if you're opposed to disposable products, definitely don't get these, but I find them way easier than having a half-dozen or a dozen tiny dishes to wash out after each time I make inks, and I was able to pick up a returned box for a bit over 1 cent per dish.  Cheaper but slightly more labor-intensive alternative is folding tiny dishes out of tin foil.

Plastic bottles for storing samples - these are identical to the ones used by GouletPens.com for ink samples, just with blue lids instead of white.  I already had them lying around as I use them to hold painkillers while running; there may be other, better alternatives available, especially if you've gone through a lot of commercial ink and have saved the bottles.

Marking tape for marking samples - you could probably also use masking tape or just write directly on the vials, but I've found this sticks better long-term than other tapes while still being removable.  However, it has no adhesion whatsoever to certain types of glass used in small sample vials.

Rack for holding samples - this appears to be the same one sold by GouletPens.com, but for about half the price and with Prime shipping.  If you want to support an awesome specialty retailer, buy it from Goulet.  If you're like me and already regularly spend far too much money at Goulet and want to save a couple bucks this time, buy it from Amazon.

Jinhao 993 Shark pens for testing inks - $1.50 each for a dozen fairly solid pens.  Only downside is that they're a bit inconsistent in terms of ink feeding; that said, it's still cheaper to try out an ink in 3 sharks than pretty much any pen that's better.  You can get them cheaper on AliExpress and possibly elsewhere, but this is cheap enough and much faster than waiting for delivery from China.

Monday, February 26, 2018

reMarkable e-ink tablet first impressions

I normally stay away from first-generation devices (let's be honest, I normally hold off several years longer than most people when it comes to adopting a new type of device).  However, ever since the reMarkable e-ink tablet kickstarter a while back, I've been intrigued.  I prefer paper for most types of writing, mostly due to the seamless free-formness of it.  I don't need to switch between programs and learn new toolsets to add a sketch to the margin if it helps illustrate what I'm writing about, whereas doing the same on a computer takes a lot of work.  That said, if I then have to access those notes elsewhere, I'd better have the right notebook handy.  Or, if I need to share a page with someone, I need to either type it up or scan it, both of which are annoying.  Having everything automatically digitized was enticing enough that when the reMarkable went on a bit of a sale last week, I bit the bullet and purchased one.  As they're still somewhat sparse in the wild, I figured I'd write up my first impressions to help other people decide if they want one.

Pictures


I've uploaded pictures of unboxing and setup, as well as examples of writing/drawing, to Imgur at https://imgur.com/a/lKOxJ

Physical build


In terms of physical build, I'm a bit torn.  I like the overall shape and setup, but the different plastic vs. metal components feel a bit weird.  I'm not used to the front of a device being only plastic, but then again, I'm not used to using a tablet at all, so it may just be that I'm not used to the overall form factor.  The one specific complaint is that I wish the back were a bit less smooth.  It would feel slightly better to hold with one hand with a bit more texture, but I can't see myself using it in that manner frequently anyway so it's a bit of a moot point.  I expect the main ways I'll hold it are sitting flat on a table or other surface, held with two hands, or, if I need to write and don't have a flat surface, held with my hand at the top and the bottom against my torso.

Related to its holdability/use, I can't see myself taking this thing anywhere without a case like I frequently do with my laptop.  The exposed screen and plastic construction make that seem too risky, although it does sound from what I've read like it's a fairly sturdy device.  In addition, it seems far better suited to a folio-style case than a slide-in-and-out sleeve.  There aren't any reasonably-priced ones being made specifically for the reMarkable yet, but there are some generic ones that work just fine.  I'm currently using a fake-leatherbound-book-style case that was $12.99 on Amazon, and it works pretty well.  A couple minor complaints (I wish the cover could fold all the way around back more easily, and the storage pockets are fairly useless if you don't want to risk scratching the screen) but overall it does what it needs to.  The spine is a bit thicker than I'd like, but I expect that's part of how it protects the tablet screen while stored flat.  If I need to toss the tablet in my backpack and space is at a premium, I also got a basic padded canvas sleeve that fits it perfectly.  Not the heaviest-duty case available, but it was $10.99 and purple, so why not.  It'll get use, but not as much as the book-style one.

Usage


When viewing a document on my laptop while writing on it on the tablet, a single line took ~10 seconds to appear on the computer screen after being drawn.  However, that requires the WiFi to be on, which has a pretty big hit on the battery life.

I loaded a handful of PDFs onto the device using the Mac OSX client.  First one (2 pages) opened without a problem and I was able to read it, annotate it, etc.  I had what's been my only crash so far opening the second, a 23-page paper with a couple graphs and images.  When I went to open it, the screen loaded a fuzzy version of the first page, hung for 5-10 seconds, and then the device spontaneously restarted.  When I reopened the paper, it opened fine.

I've used the tablet for the last three days for reading papers and journalling, and it's working great.  In terms of reading, it's the closest thing to actual paper that I've used.  I have a Kindle Paperwhite lying around somewhere, but I haven't used it in ages because I prefer physical books for book-length stuff (probably 90-95% of my reading) and because the small screen and inability to write get in the way of reading PDFs of academic articles (the other 5-10%).  Because the reMarkable screen is nearly the size of a piece of paper, it feels fine to read PDFs that I'd otherwise have to print out and then keep track of.

I want to write more about using it, as this section feels short, but in a way that's a good sign.  I usually journal in an A5 unlined paper journal.  However, since getting started with the reMarkable on Friday, I haven't touched it (literally, in fact).  I'm sure it'll still get use when I want something smaller/less fragile or I'm feeling old-school, but the fact that I didn't use it at all over the weekend is a pretty strong indicator to me that the reMarkable is (for my use cases) a fairly solid paper replacement.

Programming


I've played around a bit with the cloud API, and the code I've written is up at https://github.com/stevenorum/remarkable.  The only two existing client libraries I've found are PHP (which I hate) and Go (which I don't have the time to learn right now due to focusing on C/C++), so I've started tossing together a very basic one in Python3 (specifically doesn't work with Python2, so get with the times or GTFO).  Doesn't do much useful, but lets you pull down the files stored in the cloud service.  I may or may not keep building on this as time permits.

Random annoyances/feature requests


(Important note: Don't let the length of this list obscure the fact that I really like this tablet and am glad I got it; it just isn't (yet) perfect for my specific uses and tastes.)

  • Battery life is still a bit iffy, but more annoyingly it doesn't provide specific info about how much life is remaining.  It did prior to the latest update, but apparently that was frequently inaccurate.  Give the user better info about the state of the battery and what different features can be disabled to improve it.
  • I wish there were an easy way to turn off WiFi and have it not turn back on unless I explicitly tell it to.
  • A bit more complex but also useful, I wish there were some way to tie WiFi syncing to tablet usage.  For example, if I open the tablet and it's been more than a couple hours since I last used it, turn on the WiFi momentarily to check.  Likewise, if I've made changes to documents (more than just changing what page I'm reading), temporarily enable WiFi and sync when I put the tablet to sleep.
  • Let me explicitly sync a document from the device to the cloud.  Sometimes it seems to take a while with no apparent reason why.
  • Something a bunch of other users have mentioned but that bears repeating: provide a better interface for taking notes while reading a document.  Either some sort of landscape split-screen, with a document on one side and notepad on the other, or some button that lets you temporarily pop up a notepad to jot something down.
  • Smarter text integration when reading documents.  Being able to select text in a PDF and have that stored as the actual text selected instead of a line on the page would be great.  For example, create a reference page for each PDF that gives you the selected text snippets with links to those parts of the documents.  That'd make it way more useful for reading and reviewing documents.
  • Improve the document move interface.  When I've selected the documents and click move, having a full directory tree displayed (potentially with some tapping to enable expansion or scrolling necessary, if there are a lot of directories) instead of the usual filesystem navigation interface would be both more intuitive (at least for me) and require far fewer taps for most moves.
  • Make the unlock screen keypad bigger or (ideally) configurable size-wise.  It feels a little small, given the screen space available and the fact that the device is meant to be used edge-to-edge.  It's the same size as the unlock pin pad on my phone, even though that's designed for a much different usage type (one-handed with one finger instead of an entire hand that isn't tethered to the tablet).
  • Related to the above, make it possible to connect an external keyboard.  Being able to type and then easily add in sketches would be great for getting project ideas down onto (e-)paper.
  • Having the option to not display thumbnails in the file list menu would let it be about twice as dense, which is preferable for stuff like a directory of academic articles where the thumbnails don't provide any value.
  • Having the option to cut off the margins of PDFs would be awesome.  When I'm reading a paper that's formatted in the normal two-column layout, there's 1/2 inch on each side and at least 3/4 inch at each of the top and bottom that's wasted whitespace.  It can be useful for making notes, but most of the time when reading it's just making the text a little smaller with no benefit.  The case of the tablet is already designed to look like the margins.
  • Not sure if this is closer to an issue with reMarkable or an issue with Preview on Mac OSX, but the PDF exports don't show the shading of the writing/drawing that a user adds.  However, the PNG exports do, and if you export as a PDF and then have Preview export that PDF to PNG, the shading returns, so it's probably an issue with Preview.


Slightly more pie-in-the-sky requests



  • Put a different type of tip on the back end of the stylus that can be recognized separately and treated as an eraser.
  • Providing a command line directly on the device (instead of just letting you shell in from another computer) would be understandably difficult and risky but still undeniably cool.
  • I know the challenges involved in reliable handwriting recognition and so I don't expect handwritten-to-text conversion anytime soon.  However, it'll be cool when it arrives, either provided by reMarkable or some third-party add-on.


Monday, November 6, 2017

NaMeCryReNoWriMo, day 6: Reading Signal blog posts

Today's studying started with reading on the Signal Protocol but quickly devolved into reading signal.org blog posts.

Signal protocol


https://en.wikipedia.org/wiki/Signal_Protocol

"The protocol combines the Double Ratchet Algorithm, prekeys, and a triple Diffie–Hellman (3-DH) handshake, and uses Curve25519, AES-256 and HMAC-SHA256 as primitives."


Double Ratchet Algorithm


Key management algorithm.
Manages renewal and maintenance of short-lived session keys while providing forward secrecy.  Based on Off-the-Record Messaging and Silent Circle Instant Messaging Protocol.
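
A rough sketch of just the symmetric-key half of the ratchet, to show where the forward secrecy comes from.  The Diffie-Hellman half (which mixes in fresh key exchanges) is left out entirely, the starting secret is a made-up placeholder, and the HMAC-SHA256 constants 0x01/0x02 are my assumption for illustration rather than something taken from the notes above.

```python
import hashlib
import hmac
from typing import Tuple


def ratchet_step(chain_key: bytes) -> Tuple[bytes, bytes]:
    """Advance the chain one step and return (next_chain_key, message_key)."""
    # Both outputs are derived one-way from the current chain key, so
    # someone who steals a later chain key can't work back to earlier ones.
    message_key = hmac.new(chain_key, b"\x01", hashlib.sha256).digest()
    next_chain_key = hmac.new(chain_key, b"\x02", hashlib.sha256).digest()
    return next_chain_key, message_key


# Hypothetical starting point; in the real protocol this comes from the handshake.
chain_key = hashlib.sha256(b"placeholder shared secret").digest()

message_keys = []
for _ in range(3):
    # The old chain key is overwritten each step, which is the whole point.
    chain_key, message_key = ratchet_step(chain_key)
    message_keys.append(message_key)
```

Each message gets its own short-lived key, and throwing away the old chain key after each step is what keeps earlier messages safe if the current state leaks.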

Signal blog


https://signal.org/blog/contact-discovery/
TL;DR There is no theoretically secure way to let a user find which of their contacts are using Signal that prevents Signal from being able to see their social graph.  We're just writing the code to not store that info and giving users the choice between trusting us and opting out of contact discovery.

https://signal.org/blog/private-contact-discovery/
TL;DR In order to let a user find which of their contacts are using Signal without giving Signal a way to store that data, the lookup is done inside a server-side Intel SGX enclave running verifiable code in a way that ensures the host machine can't get insight into the social graph through memory access patterns.

https://signal.org/blog/safety-number-updates/
https://signal.org/blog/there-is-no-whatsapp-backdoor/
https://signal.org/blog/doodles-stickers-censorship/
https://signal.org/blog/giphy-experiment/
https://signal.org/blog/signal-android-attachment-bug/
https://signal.org/blog/facebook-messenger/
https://signal.org/blog/advanced-ratcheting/ <- Need to read this one again to fully grok all of it.

??? Read https://signal.org/docs/


Sunday, November 5, 2017

NaMeCryReNoWriMo, day 5: [redacted]

Today's studying has been playing around with HSMs.  Unfortunately, that means it's a bunch of writing code for the product on which I work, so I need to run it by legal folks before I post it publicly.

Stay tuned for your regularly scheduled meandering crypto refresher notes tomorrow.

Saturday, November 4, 2017

NaMeCryReNoWriMo, day 4: Feistel ciphers and HSMs

Feistel cipher


https://en.wikipedia.org/wiki/Feistel_cipher

Class of block cipher.  Encryption/decryption very similar, so less to implement.  Made up of a "round" that gets performed many times.

Encryption:
(For each round)
The input is either the plaintext or the result from the previous round.  Split it into two halves, Li (Left input) and Ri (Right input).
Lo (Left output) is Ri.
Ro (Right output) is Li xored with F(Ri, Kr), where F is the underlying function of the cipher and Kr is the key for that specific round.
And now Lo and Ro are Li and Ri for the next round.

Decryption:
(For each round)
The input is the ciphertext, made up of Lo and Ro.  We want to find Li and Ri.
Ri is just Lo.
Li is Ro xored with F(Ri, Kr) [and because of above, remember that F(Ri, Kr)==F(Lo, Kr)]
And Li and Ri are Lo and Ro from the previous round, so this can be repeated back down to the plaintext.
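
A minimal sketch of the rounds described above.  The round function F here is just HMAC-SHA256 truncated to the half-block size, which is a stand-in I picked for illustration and not the F of any real cipher; the function names and the round keys are likewise made up.

```python
import hashlib
import hmac


def round_function(right: bytes, round_key: bytes) -> bytes:
    """Toy F(Ri, Kr): HMAC-SHA256 truncated to the half-block size."""
    return hmac.new(round_key, right, hashlib.sha256).digest()[:len(right)]


def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def feistel_encrypt(block: bytes, round_keys) -> bytes:
    half = len(block) // 2
    left, right = block[:half], block[half:]
    for key in round_keys:
        # Lo = Ri, Ro = Li xor F(Ri, Kr)
        left, right = right, xor_bytes(left, round_function(right, key))
    return left + right


def feistel_decrypt(block: bytes, round_keys) -> bytes:
    half = len(block) // 2
    left, right = block[:half], block[half:]
    for key in reversed(round_keys):
        # Ri = Lo, Li = Ro xor F(Lo, Kr)
        left, right = xor_bytes(right, round_function(left, key)), left
    return left + right


# Hypothetical round keys; a real cipher derives these with a key schedule.
round_keys = [bytes([i]) * 16 for i in range(8)]
plaintext = b"sixteen byte msg"
ciphertext = feistel_encrypt(plaintext, round_keys)
recovered = feistel_decrypt(ciphertext, round_keys)
```

Note that F never needs to be inverted: decryption calls the same F with the round keys in reverse order, which is why there's so little extra to implement.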

L and R are not always balanced; Skipjack is moderately unbalanced, while the Thorp Shuffle has L as a single bit.

Most block ciphers are based on either Feistel ciphers or substitution-permutation networks (or potentially a mix of the two?)

RC4


https://en.wikipedia.org/wiki/RC4

Stream cipher.  Functions by repeatedly permuting (swapping entries in) an array of 256 bytes, and after each swap outputs one byte of the array, selected by adding two of the other entries together.
Does not use a nonce, only a key.
Exact algorithm is on the wikipedia page.  I don't feel like rewriting it.
Basic weakness is that the first couple bytes of the output leak information about the initial generating key.  Key recovery attacks can be performed with millions or billions of messages (wikipedia calls out 2^26 and 2^34)
"The use of RC4 in TLS is prohibited by RFC 7465 published in February 2015."

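For reference, a short sketch of the two phases (key scheduling, then output generation) as they're usually described.  The function names are my own, and this is only for study purposes given the weaknesses noted above.

```python
def rc4_keystream(key: bytes):
    """Yield RC4 keystream bytes for the given key."""
    # Key-scheduling: start from the identity permutation and shuffle it
    # using the key bytes.
    state = list(range(256))
    j = 0
    for i in range(256):
        j = (j + state[i] + key[i % len(key)]) % 256
        state[i], state[j] = state[j], state[i]

    # Output generation: keep swapping, and after each swap emit the entry
    # indexed by the sum of the two entries just swapped.
    i = j = 0
    while True:
        i = (i + 1) % 256
        j = (j + state[i]) % 256
        state[i], state[j] = state[j], state[i]
        yield state[(state[i] + state[j]) % 256]


def rc4(key: bytes, data: bytes) -> bytes:
    """Encrypt or decrypt (same operation) by xoring with the keystream."""
    return bytes(b ^ k for b, k in zip(data, rc4_keystream(key)))


ciphertext = rc4(b"Key", b"Plaintext")
roundtrip = rc4(b"Key", ciphertext)
```

Since there's no nonce, the same key always produces the same keystream, so reusing a key across messages is a problem on top of the biases in the first output bytes.
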
PKCS #11


https://en.wikipedia.org/wiki/PKCS_11
http://docs.oasis-open.org/pkcs11/pkcs11-base/v2.40/errata01/os/pkcs11-base-v2.40-errata01-os-complete.pdf
http://wiki.ncryptoki.com/introduction-to-pkcs-11-specifications.ashx
https://www.opendnssec.org/softhsm/

Standardized programming interface for cryptographic operations.  Cryptoki API communicates with crypto stuff via "slots".  Slots map to crypto "tokens", usually hardware devices specifically built for cryptographic operations.  (It's possible to have a software-backed slot for testing, though, such as with the SoftHSM linked above.)  Each token can have a number of data blobs, keys, and certificates, and supports various operations on each depending on data type and configuration settings.

FIPS 140-2


https://en.wikipedia.org/wiki/FIPS_140-2
http://nvlpubs.nist.gov/nistpubs/FIPS/NIST.FIPS.140-2.pdf

Federal Information Processing Standard 140-2: Security Requirements for Cryptographic Modules
Defines four different levels of security for hardware+software cryptographic modules.  1 is weakest, 4 is strongest.

According to wikipedia:

Level 1: It can do some form of approved crypto.

Level 2: Has physical tamper-evident features and/or pick-resistant locks.

Level 3: Has features that wipe the crypto material in the module if it's tampered with.

Level 4: Like level 3, but to the extreme.  Must either be shown to be unaffected by abnormal environmental conditions or, if it is affected, to wipe crypto material before it's compromised.

I seem to recall there also being software components to FIPS 140-2...
Yep, wikipedia leaves out a lot of the NIST guidelines.  Good job, guys.

Level 1: Basically as shown on wikipedia.

Actually, it appears that whoever wrote the wikipedia article just grabbed the first couple sentences from each of the level summaries in the NIST doc, and as the summaries start with the physical requirements, that completely omits the interesting stuff.

Level 2: Also requires role-based auth to perform crypto operations, and requires that the hardware module be used from a computer meeting Common Criteria Protection Profiles (Annex B) and meeting CC EAL2. (whatever that means)

Level 3: Also requires identity-based auth, not just role-based.  Input or output of plaintext critical security parameters (passwords, keys, etc) must travel through physically or logically isolated ports.  Must meet Annex B plus a Trusted Path (the isolated ports thing), as well as EAL3.

Level 4: Also needs to meet EAL4.

(If it wasn't clear, Level n+1 also needs to meet all of the requirements for Level n.)

New questions/topics:

??? Pseudorandom function?  Pseudorandom permutation?  Pseudorandom generator?
??? Skipjack, key escrow, and general government chicanery
??? Thorp Shuffle? http://web.cs.ucdavis.edu/~rogaway/papers/thorp.pdf
??? Substitution–permutation network?
??? eSTREAM?
??? Play around with PKCS #11 a bit and get some code up.
??? Identity-based auth vs. just role-based?
??? Common Criteria Protection Profiles, Annex B?  EAL2/3/4?

Friday, November 3, 2017

NaMeCryReNoWriMo, day 3: Salsa20/ChaCha20 and Poly1305

Not as much today as I'd like because I actually had work to do, but I may add more later tonight if I have nothing else to do.

Salsa20


Stream cipher.  Uses a 256-bit key, 64-bit nonce, and 64-bit stream index, and generates a 512-bit block.  Has the unusual (for a stream cipher; block ciphers in counter mode also have this) property that any given block in the keystream can be generated in constant time.

Basic bit operations are xor, 32-bit addition mod 2^32, and left-rotation.

Each 512-bit stream block is made by performing those on a 512-bit block consisting of the key (256), the stream position (64), the nonce (64), and a constant value (128).  These blocks are a total of 16 "words" (which here means 32 bits, i.e. 4 bytes, each) arranged in a 4x4 matrix.  Each word w0 is mutated by combining it with two other words from the block and one of the numbers 7, 9, 13, or 18, depending on its placement.  I'm still having trouble figuring out exactly how the matrix is constructed.

20 rounds of this operation are performed for standard Salsa20, although weaker (and faster) versions use fewer rounds.
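
A sketch of the quarter-round (the "combine with two other words and rotate by 7, 9, 13, or 18" step) plus the initial matrix layout as I understand it from the spec: the four constant words sit on the diagonal, with the key, nonce, and position filling in around them.  Function names are my own, and this stops short of the full 20-round block function.

```python
import struct

MASK32 = 0xFFFFFFFF


def rotl32(value, shift):
    """Rotate a 32-bit word left by `shift` bits."""
    return ((value << shift) | (value >> (32 - shift))) & MASK32


def quarterround(y0, y1, y2, y3):
    """Each word is xored with a rotated 32-bit sum of two other words."""
    z1 = y1 ^ rotl32((y0 + y3) & MASK32, 7)
    z2 = y2 ^ rotl32((z1 + y0) & MASK32, 9)
    z3 = y3 ^ rotl32((z2 + z1) & MASK32, 13)
    z0 = y0 ^ rotl32((z3 + z2) & MASK32, 18)
    return z0, z1, z2, z3


def salsa20_initial_state(key, nonce, block_index):
    """Lay out the 16 words row by row, constants on the diagonal."""
    c = struct.unpack("<4I", b"expand 32-byte k")  # the 128-bit constant
    k = struct.unpack("<8I", key)                  # 32-byte key -> 8 words
    n = struct.unpack("<2I", nonce)                # 8-byte nonce -> 2 words
    p = struct.unpack("<2I", struct.pack("<Q", block_index))  # 64-bit position
    return [
        c[0], k[0], k[1], k[2],
        k[3], c[1], n[0], n[1],
        p[0], p[1], c[2], k[4],
        k[5], k[6], k[7], c[3],
    ]


state = salsa20_initial_state(bytes(range(32)), bytes(8), 5)
mixed = quarterround(0x00000001, 0, 0, 0)
```

Because the block index is just two words of the input matrix, any block of the keystream can be computed directly without generating the ones before it.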


ChaCha20


Variant of Salsa20 with a different combination of bit operations used in each round.

Google (and others, I assume?) uses it, along with Poly1305, as a replacement for RC4 in TLS.

For ChaCha20, the 16-word matrix is constructed from 4 constant words, the 8 words of the key, one word of a position counter (as that's enough for 256GB of data), and 3 words of nonce, in that order.
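
A small sketch of that layout and of the ChaCha quarter-round, following RFC 7539 (linked below).  The function names are mine, and this is only the state setup and one quarter-round, not the full block function.

```python
import struct

MASK32 = 0xFFFFFFFF


def rotl32(value, shift):
    """Rotate a 32-bit word left by `shift` bits."""
    return ((value << shift) | (value >> (32 - shift))) & MASK32


def chacha_quarterround(a, b, c, d):
    """ChaCha quarter-round: add, xor, rotate by 16, 12, 8, then 7."""
    a = (a + b) & MASK32
    d = rotl32(d ^ a, 16)
    c = (c + d) & MASK32
    b = rotl32(b ^ c, 12)
    a = (a + b) & MASK32
    d = rotl32(d ^ a, 8)
    c = (c + d) & MASK32
    b = rotl32(b ^ c, 7)
    return a, b, c, d


def chacha20_initial_state(key, counter, nonce):
    """4 constant words, 8 key words, 1 counter word, 3 nonce words."""
    constants = struct.unpack("<4I", b"expand 32-byte k")
    key_words = struct.unpack("<8I", key)      # 32-byte key
    nonce_words = struct.unpack("<3I", nonce)  # 12-byte nonce
    return list(constants + key_words + (counter & MASK32,) + nonce_words)


state = chacha20_initial_state(
    bytes(range(32)), 1, bytes.fromhex("000000090000004a00000000")
)

# One counter word covers 2^32 blocks of 64 bytes each, i.e. 256 GiB.
max_bytes_per_nonce = 2**32 * 64
```

Unlike Salsa20, the constants go in the first row rather than on the diagonal, and every word gets updated twice per quarter-round.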


Poly1305


Message authentication code.  Produces a 16-byte tag for a message of any size.  Uses two 128-bit keys r and s.

The basic algorithm is as follows:
-Break the message into 16-byte blocks.  For each block:
--Append one more byte with value 1 (i.e., set one bit just past the end of the block).  If it's a 16-byte block this is equivalent to the block plus 2^128, but if it's the last block it may be shorter, in which case it's equivalent to adding 2^(8k), where k is the block's length in bytes.
--Pad the block with 0s to 17 bytes.
--Add this to the result from the previous message block (add 0 if it's the first block)
--Multiply this by r (first key). (this makes it a polynomial function, hence the first half of the name)
--take this mod p, where p = (2^130)-5 (this is where the second half of the name comes from)
-Once you've done this for every block, add s, and then the 128 LSBs form the MAC.
(Random note: all the byte stuff above is done little-endian)
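
The steps above translate fairly directly into Python, since it has arbitrary-size integers.  One detail that isn't in my notes above but is in RFC 7539: r gets "clamped" (certain bits forced to zero) before use.  The function name is my own, and this sketch is for understanding only (it makes no attempt to be constant-time).

```python
P = (1 << 130) - 5
# Clamp mask from RFC 7539: clears the top 4 bits of bytes 3, 7, 11, 15
# and the bottom 2 bits of bytes 4, 8, 12 of r.
R_CLAMP = 0x0FFFFFFC0FFFFFFC0FFFFFFC0FFFFFFF


def poly1305_mac(message: bytes, key: bytes) -> bytes:
    """Compute the 16-byte Poly1305 tag; key is r (16 bytes) then s (16 bytes)."""
    if len(key) != 32:
        raise ValueError("key must be 32 bytes (r followed by s)")
    r = int.from_bytes(key[:16], "little") & R_CLAMP
    s = int.from_bytes(key[16:], "little")

    accumulator = 0
    for offset in range(0, len(message), 16):
        block = message[offset:offset + 16]
        # Appending the byte 0x01 adds 2^(8 * len(block)) to the block's value.
        n = int.from_bytes(block + b"\x01", "little")
        # Add to the running result, multiply by r, reduce mod 2^130 - 5.
        accumulator = ((accumulator + n) * r) % P

    # Add s and keep the 128 least significant bits.
    tag = (accumulator + s) & ((1 << 128) - 1)
    return tag.to_bytes(16, "little")


# With r = 0 every block is multiplied away, so the tag is just s.
trivial_tag = poly1305_mac(b"anything", bytes(16) + bytes(range(16)))
```

The r = 0 case at the end is a handy sanity check, and also a hint at why the (r, s) pair must never be reused across messages: the tag is only as unpredictable as those two values.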


Camellia


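As far as I can tell, it's pretty similar to AES from the outside (128-bit blocks; 128-, 192-, or 256-bit keys).  The differences are in the internals: it's a Feistel cipher rather than a substitution-permutation network, with its own substitutions/permutations.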


ARIA


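Substitution-permutation network like AES (neither one is a Feistel cipher).  Uses 2 S-boxes and their inverses.  One is the same as AES.  The binary representation of 1/pi is used as the source of constants for the key schedule.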


Links:


Camellia:
https://en.wikipedia.org/wiki/Camellia_(cipher)

Salsa20/ChaCha20:
https://en.wikipedia.org/wiki/Salsa20
https://cr.yp.to/snuffle/spec.pdf
https://tools.ietf.org/html/rfc7539

Poly1305:
https://en.wikipedia.org/wiki/Poly1305
https://cr.yp.to/mac.html
https://cr.yp.to/mac/poly1305-20050329.pdf
https://tools.ietf.org/html/rfc7539


New topics:


??? RC4
??? Feistel cipher