• Breaking News

    Tuesday, May 18, 2021

    Python library for simulating finite automata Computer Science

    Python library for simulating finite automata Computer Science


    Python library for simulating finite automata

    Posted: 17 May 2021 09:31 PM PDT

    Hello everyone, am a computer science student, actually I'm working on a python library for simulating finite automata, on this library implemented DFA and NFA, and also is able to visualize the automata, convert NFA to DFA, and also implemented the product and union of DFA.

    Here is my repo: https://github.com/rohaquinlop/automathon

    and the PyPI page: https://pypi.org/project/automathon/

    On the documentation you can see my email and my other media where you can contact me if you want, I'm making this library because I think that automata are really fun, and I want to help others when they are starting to learn about them.

    submitted by /u/fexx3l
    [link] [comments]

    Algorithmic Comic Book Lettering - A Surprisingly Hard Optimization Problem

    Posted: 17 May 2021 09:05 AM PDT

    Word wrapping - choosing how to partition a paragraph of text into lines - is a classic computer science problem, with a well-known solution due to Donald Knuth and Michael Plass.

    While their algorithm can quite flexibly handle a wide variety of scenarios, I believe I've found a variation of this problem that it cannot deal with. I can't find any relevant literature, and I've been having a hard time trying to solve it myself, so I'm hoping to find some new insight here.

    To recap, we consider a paragraph of text to be a heterogeneous sequence of glyphs (each having a fixed width) and spaces (having a "natural" width, though they may be stretched or squashed), and we want to find a subsequence of only spaces, which will be replaced by line breaks.

    We can immediately see that, for a sequence containing n spaces, there are 2n ways to break it up, most of which are nonsensical, violate some external constraints, or both. From the rest, we want to choose the one that's most aesthetically pleasing, or at least close enough to be inoffensive.

    For most prose text, we want each line to be shorter than some maximum width l, or more generally, we want line i to be shorter than l[i] (e.g. to fit text around an illustration). Subject to this constraint, we want to minimize both the total white space to the right of each line (i.e. sum(i in 0..N) l[i] - L[i]) and the raggedness of the paragraph (i.e. sum(i in 1..N) |L[i] - L[i - 1]|), and we can use dynamic programming to do it. In practice, you want to sum squared differences and add some other penalties, but that's the gist of it.

    Comic book typography is an entirely different beast, however. Specifically, I'm referring to dialogue, laid out in word balloons; I'm disregarding sound effects here. Sure, for a given balloon, you could construct a sequence of desired line lengths, use the DP algorithm, and be done with it, but even that's tricky when you could fit the same text into a variable number of lines. Really, though, you want to fit the balloon to the text it contains, not the other way around.

    And with that, the concept of a desired line length as input goes out the window. We're still subject to the constraint of a maximum line length (the width of the panel/page), but we don't really care about using any specific portion of it. Rather, we want the overall contour of the layout to conform to an oval shape, and we don't want that oval to be too eccentric. You can look at this infographic by Nate Piekos to see what I mean.

    We'll need a new metric to optimize, and, at least intuitively, it seems clear we'll also need a different algorithm to do it - I'm not so sure it's even possible to solve this with dynamic programming at all, but more on that later.

    So, given a sequence of n line lengths L (which we can easily derive from a proposed sequence of breakpoints), and assuming a constant line height h, we can easily construct a sequence of vertically adjacent rectangles, centered around the origin, here given as a sequence of 4-tuples of corner points:

    Y := nh/2, R[i] := ( (-L[i]/2, ih - Y), ( L[i]/2, ih - Y), (-L[i]/2, (i + 1)h - Y), ( L[i]/2, (i + 1)h - Y) ) 

    We can find the superellipse, also centered at the origin, with minimal area, that fully encloses these rectangles - I'll sketch out an algorithm for this further down, if you're interested, but I don't think it's all that relevant to the discussion. We can then calculate its area, and subtract the total area of the rectangles to obtain a measure of negative space.

    (We use a superellipse, i.e. the set of points defined by |x/a|^m + |y/b|^m = 1, because a proper ellipse (where m = 2) just looks a bit too "skinny"; we'll assume a fixed value of m that is greater than 2.)

    Minimizing this quantity should naturally penalize asymmetric layouts, as well as those where a line closer to the center is shorter than one further away from it, and do so with appropriate weights. An additional penalty for high eccentricity is also required, something like ab - (min {a,b})^2 with a configurable weight.

    Hopefully, you can see my issue: there's no obvious way to decompose the line-breaking problem into sub-problems such that we can meaningfully reuse their solutions. How good a proposed solution for a subsequence is depends on what the rest of the sequence looks like, and that's going to hold whether you're going forwards, backward, or from the middle outwards. And I think this is intrinsic to the problem, rather than just an issue with this metric since every other metric I could think of was the same in this regard. (I am open to suggestions on how to improve this as well, though.)

    I have an algorithm sketched out that can find a near-optimal placement for up to three line breaks in linear time, but I don't see how it would generalize. Maybe it could serve as the basis for some kind of recursive divide-and-conquer scheme, but again, I'm not sure how that would break down.

    Anyway, it goes like this:

    • In a linear pass through the sequence, find the breakpoint that is closest to the visual midpoint; let's call its index j. This is obviously the optimal point to break if it were the only line break, but we'll evaluate the metric anyway for later comparisons.
    • In a second pass using two indices i, k going from the outside inwards:
      • evaluate the metric for both [i, k] and [i, j, k]*, and remember the tuple/ triple if it scored better than the previous best,
      • advance i or k to the next breakpoint, whichever will leave the first and last line closer in length; terminate if either is longer than the line from i to k.
    • Return one of [], [j], [i, k], [i, j, k], depending on which scores best.

    *: Once we've seen [i, k] perform better than [i, j, k], it should be safe to assume we've already seen the optimal placement of three breaks, so we can stop evaluating the metric for that.

    This doesn't account for the maximum line width, of course, but that's a straightforward extension.

    Honestly, this seems just barely good enough for my use case, but it is pretty limiting. I'd like to find the optimum using any number of line breaks less than a configurable maximum, but I wouldn't be surprised to learn that's just not possible in polynomial time. In that case, I'd be happy with an approximate algorithm to do the same.

    Any ideas on how to improve on this would be greatly appreciated!

    Side Note: Fitting a Superellipse Around a Stack of Rectangles

    Let P be the set of corner points of the rectangles. Iterate over all pairs of points in P, find the semi-axes of the superellipse going through them, and check if it contains all the other points. If it does, calculate its area, and remember only the smallest one seen so far.

    This has cubic time complexity in the number of points, but we can use a set of points that is smaller by a factor of 4, by exploiting some symmetries:

    • By definition, the corner points are symmetric about the y-axis (forall (x, y) in P: (-x, y) in P), so we can discard all points with x < 0.
    • While the corner points need not be symmetric about the x-axis, their projection onto the y-axis is, i.e. forall (x1, y) in P: (x2, -y) in P. From each such pair of points, we need only keep the one where the x-coordinate has greater absolute value since a superellipse going through the other one will never include it.

    We can shave this down further by throwing out points that aren't part of the convex hull of P.

    That's only an improvement on the constant factors, but there's an approximate, linear-time algorithm we can fall back on for larger n, that also gives visually acceptable results:

    • Pick any point in P and find the semi-axes of the superellipse with minimal area going through it (which will generally be of the form a = k*x, b = k*y, where k depends on the exponent m used to define the superellipse).
    • Iterate over the remaining points; for each point (p_x, p_y) outside the superellipse, stretch one of its radii so that the point lies on the perimeter:
      • calculate the half-width w of the superellipse at y = p_y,
      • calculate the half-height h of the superellipse at x = p_x,
      • if |p_y|/h < |p_x|/w set b := b * |p_y|/h, otherwise set a := a * |p_x|/w.
    submitted by /u/teryror
    [link] [comments]

    No comments:

    Post a Comment